PepPSy: a web server to prioritize gene products in experimental and biocuration workflows

List of the ten PE5 entries from neXtProt release 2014-09-19 that have expression information in the four transcriptomic datasets

Line	neXtProt accession	Gene symbol	Observability status	UniGene	Affymetrix U133 Plus 2.0	Affymetrix All Exon 1.0	Human Protein Atlas RNA-seq
1	NX_B1AH88	TSPO	observable	Intestine > Brain > …	Mouth > Bone marrow > …	Spleen > Intestine > …	Bone marrow > Skin > …
2	NX_Q9NPU4	C14orf132	with special handling	Brain > Testis > …	Brain > Spinal cord > …	Brain	Brain > Oviduct > …
3	NX_P0CF97	FAM200B	likely unobservable	Brain > Eye > …	Male germ cell > Brain > …	Brain	Ovary > Kidney > …
4	NX_Q9BWV7	TTLL2	likely unobservable	Testis > …	Male germ cell > Testis	Testis	Testis
5	NX_Q5T036	FAM120AOS	likely unobservable	Brain > Lung > …	Placenta > Pituitary gland > …	Intestine > Kidney > …	Placenta > Thyroid > …
6	NX_Q96SF2	CCT8L2	likely unobservable	Testis > Brain > …	Male germ cell > Testis	Intestine > Pancreas > …	Testis
7	NX_Q8IVY1	C1orf210	with special handling	Pancreas > Intestine > …	Female germ cell	Intestine > Kidney > …	Intestine > Stomach > …
8	NX_P0CB46	CASP16	likely unobservable	Spleen > Uterus > …	Female germ cell	Intestine	Intestine
9	NX_Q8N5Q1	FAM71E2	observable	Testis > Thymus > …	Male germ cell	Testis	Testis
10	NX_Q96HZ7	C21orf119	likely unobservable	Intestine > Prostate > …	Testis > Male germ cell > …	Testis > Kidney > …	Skeletal muscle > Thyroid > …

The list has been prioritized using the default parameters of PepPSy. The last four columns show the two tissues with the highest reported expression levels in each dataset.

Table 1.

The first entry on this list (NX_B1AH88) corresponds to a very unusual annotation case in UniProtKB/Swiss-Prot (and hence in neXtProt). Although the usual UniProtKB/Swiss-Prot procedure is to merge all the splice isoforms that arise from one gene into a single UniProtKB/Swiss-Prot entry, this particular isoform has been annotated as a separate entry because it results from another reading frame and does not share any sequence with the other isoform (NX_P30536) (26). Because NX_B1AH88 and NX_P30536 are potentially transcribed from the same gene, they are mapped to the same Ensembl gene identifier (ENSG00000100300) and inherit any transcriptomics data linked to this gene identifier. Therefore, the data retrieved for the dubious NX_B1AH88 isoform are probably an artefact and would need to be remapped to the well-known isoform (NX_P30536, PE1).
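The inheritance problem described for NX_B1AH88 can be spotted mechanically. The following is a minimal sketch, not PepPSy's actual code: it assumes a hypothetical in-memory mapping from neXtProt accessions to Ensembl gene identifiers and flags every gene identifier shared by more than one entry, so that gene-level expression data attached to those entries can be reviewed by hand. Only the two accessions and the gene identifier named in the text come from the source; the function name and the third entry's gene identifier are placeholders.

```python
from collections import defaultdict

# Hypothetical accession -> Ensembl gene ID mapping (illustrative only).
# The first two pairs come from the text; the third gene ID is a placeholder.
ENTRY_TO_GENE = {
    "NX_B1AH88": "ENSG00000100300",
    "NX_P30536": "ENSG00000100300",
    "NX_Q9NPU4": "ENSG_PLACEHOLDER",
}


def shared_gene_ids(entry_to_gene):
    """Return {gene_id: sorted accessions} for genes mapped to by more than one entry."""
    by_gene = defaultdict(list)
    for accession, gene_id in entry_to_gene.items():
        by_gene[gene_id].append(accession)
    return {g: sorted(accs) for g, accs in by_gene.items() if len(accs) > 1}


# Entries listed here inherit the same gene-level transcriptomics data.
flagged = shared_gene_ids(ENTRY_TO_GENE)
print(flagged)
```

Any accession reported by such a check would be a candidate for manual review rather than an automatic correction, since the sketch cannot tell which of the co-mapped entries the expression signal really belongs to.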

The second entry on the list, NX_Q9NPU4, corresponds to the C14orf132 gene. The four transcriptomics datasets agree that C14orf132 is expressed at its highest levels in the brain. The annotation resources had initially predicted that this gene would encode a 173 aa protein. After re-examination, they chose another reading frame, resulting in an 83 aa transmembrane protein that appears to be conserved in most mammalian species. The sequence has been changed in UniProtKB (04-MAR-2015 release) and neXtProt (2015-04-28 release). Given the available transcriptomics data, one should look for it in brain samples, if possible after membrane enrichment. Since trypsin cleavage would yield a single, hydrophobic peptide, a special methodology may be required for its detection. Until conclusive proof of its existence at protein level is established, the status of the entry has been changed to PE3 (validated by homology).

For two other entries, NX_Q8N5Q1 (FAM71E2, line 9) and NX_Q9BWV7 (TTLL2, line 4), there is also a clear consensus among the four transcriptomics datasets, indicating highest expression levels in testis. Because cDNAs for FAM71E2 have been found in different tissues (thymus, testis and brain), the status of the entry has been changed to PE2 (validated at transcript level) in UniProtKB (04-MAR-2015 release). Since two unique peptides corresponding to FAM71E2 were identified by mass spectrometry in sperm (27), neXtProt reclassified the corresponding entry as PE1. The TTLL2 gene has a mouse ortholog, cDNAs have been found in testis, and the CCDS consortium has recently reclassified it as protein-coding. Therefore, NX_Q9BWV7 is currently under examination by UniProtKB/Swiss-Prot and neXtProt curators for upgrading to PE2 (validated at transcript level). The peptide GGLDAPDCLPYDSLSFTSR, which maps uniquely to the NX_Q9BWV7 entry, has been identified in testis by mass spectrometry. Since a single peptide is not sufficient to upgrade a protein to PE1 (validated at protein level), targeted LC-SRM studies on testis using other peptides will need to be performed.

For NX_Q96SF2 (CCT8L2, line 6), three datasets out of four show highest expression levels in testis. The CCT8L2 gene, found only in human and chimpanzee, is thought to have arisen by duplication in the Hominoidea lineage after its divergence from the Cercopithecidae (28). Although some mass spectrometry information is available (20), it has not passed the stringent quality criteria of the HPP for validation. Until more conclusive proof of its existence at protein level is obtained, the status of the entry has been changed to PE2 (validated at transcript level). Given the available transcriptomics data, it would be wise to look for this protein in testis-related samples. However, its definitive validation by mass spectrometry may not be an easy task, since it differs from its closest paralog CCT8L1P by only a few residues. This is probably why its observability status has been set as ‘likely unobservable’ by Farrah et al. (18).

For NX_Q8IVY1 (C1orf210, line 7), two of the datasets indicate highest expression levels in intestine. C1orf210 has a clear ortholog in mouse (2610528J11Rik, UniProtKB Q9CQM1) and has been identified by mass spectrometry in fetal liver, pancreas, prostate (29), breast (30) and ovary (31). It has been shown to be phosphorylated on Tyr-94 in human cell lines of various origins (32). Therefore, its status has been changed to PE1 (validated at protein level).

For NX_P0CF97 (FAM200B, line 3), the datasets indicate expression in various tissues, including brain. FAM200B is well conserved among mammals, and the CCDS consortium has recently classified it as protein coding. Although some mass spectrometry information is available (20), it has not passed the stringent quality criteria of the HPP. Until more conclusive proof of its existence at protein level is available, its status has been changed to PE3 (validated by homology). However, given its high sequence similarity with FAM200A, conclusive proof of its existence at protein level will be hard to obtain.

For NX_Q5T036 (FAM120AOS, line 5), two of the datasets indicate expression in placenta. It is classified as protein coding by the CCDS consortium and is conserved in several mammalian species, but it has not yet been detected by mass spectrometry. Given the available transcriptomics data, it would be wise to look for this protein in placenta. Pending conclusive proof of its existence at protein level, this entry is currently under examination by UniProtKB/Swiss-Prot curators for upgrading to PE3 (validated by homology) or PE4 (predicted).

For NX_Q96HZ7 (C21orf119, line 10), two datasets show expression in testis. However, C21orf119 is predicted to encode a long non-coding RNA. Therefore, NX_Q96HZ7 will remain PE5 until contradictory evidence is available.

For NX_P0CB46 (CASP16, line 8), two datasets show expression in intestine. However, a consensus has been reached between the different resources to classify it as a pseudogene (not protein coding). Therefore, NX_P0CB46 has been deleted from UniProtKB/Swiss-Prot and neXtProt.
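The per-entry statements above ("four datasets", "three datasets out of four", "two of the datasets") can be reproduced by tallying the top-ranked tissue of each dataset in Table 1. The sketch below is only an illustration, not the procedure used by PepPSy or by the curators. The tissue lists are copied from the first tissue of each Table 1 cell. The alias that counts "Male germ cell" as "Testis" is an assumption made here so that the tallies match the text, and the names `TOP_TISSUES`, `TISSUE_ALIASES` and `consensus` are invented for this example.

```python
from collections import Counter

# Top-ranked tissue per dataset, copied from Table 1 in the order
# UniGene, Affymetrix U133 Plus 2.0, Affymetrix All Exon 1.0, HPA RNA-seq.
TOP_TISSUES = {
    "TSPO": ["Intestine", "Mouth", "Spleen", "Bone marrow"],
    "C14orf132": ["Brain", "Brain", "Brain", "Brain"],
    "FAM200B": ["Brain", "Male germ cell", "Brain", "Ovary"],
    "TTLL2": ["Testis", "Male germ cell", "Testis", "Testis"],
    "FAM120AOS": ["Brain", "Placenta", "Intestine", "Placenta"],
    "CCT8L2": ["Testis", "Male germ cell", "Intestine", "Testis"],
    "C1orf210": ["Pancreas", "Female germ cell", "Intestine", "Intestine"],
    "CASP16": ["Spleen", "Female germ cell", "Intestine", "Intestine"],
    "FAM71E2": ["Testis", "Male germ cell", "Testis", "Testis"],
    "C21orf119": ["Intestine", "Testis", "Testis", "Skeletal muscle"],
}

# Assumption of this sketch: male germ cell samples are counted as testis.
TISSUE_ALIASES = {"Male germ cell": "Testis"}


def consensus(tissues, aliases=TISSUE_ALIASES):
    """Return (tissue, votes) for the most frequent top-ranked tissue."""
    counts = Counter(aliases.get(t, t) for t in tissues)
    return counts.most_common(1)[0]


for gene, tissues in TOP_TISSUES.items():
    tissue, votes = consensus(tissues)
    print(f"{gene}: {tissue} ({votes}/4 datasets)")
```

When every dataset ranks a different tissue first, as for TSPO, the reported tissue is arbitrary and only the vote count of 1 is informative.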

In conclusion, the use of PepPSy as a biocuration companion tool made it possible to quickly prioritize 10 PE5 proteins, out of a total of 616, for re-annotation by biocurators. As shown in Table 2, two of them have been validated at protein level (C1orf210 and FAM71E2), and five have been or will be upgraded to PE1-3. For four of them, further unambiguous proof of their existence needs to be found by mass spectrometry or antibody-based proteomics. Table 2 indicates in which sample these proteins could be investigated.

Table 2.

Summary of the reannotation of PE5 entries

neXtProt accession	Gene symbol	New PE	Sample suggestion for further analyses
NX_B1AH88	TSPO	PE5
NX_Q9NPU4	C14orf132	PE3	Brain; membrane fraction
NX_P0CF97	FAM200B	PE3	Brain; will be difficult to distinguish from FAM200A
NX_Q9BWV7	TTLL2	PE1? (in progress)	Testis
NX_Q5T036	FAM120AOS	PE3? (in progress)	Placenta?
NX_Q96SF2	CCT8L2	PE2	Testis; will be difficult to distinguish from CCT8L1P
NX_Q8IVY1	C1orf210	PE1
NX_P0CB46	CASP16	Deleted
NX_Q8N5Q1	FAM71E2	PE1
NX_Q96HZ7	C21orf119	PE5
NX_Q8IYS8	BOD1L2	PE2	Testis
NX_P0CG32	ZCCHC18	PE3	Brain?; will be difficult to distinguish from ZCCHC12
NX_C9J798	RASA4B	PE3	Skeletal muscle; quasi undistinguishable from RASA4

We will continue to provide UniProtKB/Swiss-Prot curators with transcriptomic evidence and other available information for all PE5 entries with transcriptional information in at least one of the available transcriptomics datasets (433 entries in total). We have started the process with the 11 PE5 entries that have RNA-seq information in HPA and EST information in UniGene, as well as information in one of the two microarray datasets. Following this work, the status of three entries (NX_Q8IYS8, BOD1L2; NX_P0CG32, ZCCHC18 and NX_C9J798, RASA4B) has been upgraded to PE2-3. According to the three datasets, the highest BOD1L2 expression levels are found in testis. Interestingly, BOD1L2 was unambiguously identified by two peptides detected by mass spectrometry in spermatozoa samples (27). Therefore, the entry will probably be upgraded to PE1 soon. For ZCCHC18 and RASA4B, there is no clear consensus between the transcriptomic datasets, and even with the right sample, it will be difficult to unambiguously validate their existence at protein level because of their strong similarity to ZCCHC12 (NX_Q6PEW1) and RASA4 (NX_O43374), respectively.
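The selection criteria in this paragraph amount to a simple classification by dataset coverage. The sketch below is hypothetical and does not reflect how PepPSy stores its data: the dataset labels and the function `selection_tier` are invented here, and reading "one of the two microarray datasets" as exactly one is an assumption. The final lines check that the counts quoted in the text are mutually consistent if the 21 proteins mentioned in the Conclusion are taken to be the ten entries of Table 1 plus these 11 entries.

```python
# Dataset labels are shorthand chosen for this sketch.
CORE = {"HPA", "UniGene"}          # RNA-seq and EST evidence
MICROARRAYS = {"U133", "AllExon"}  # the two Affymetrix datasets


def selection_tier(datasets):
    """Classify an entry by the set of datasets holding expression data for it."""
    datasets = set(datasets)
    has_core = CORE <= datasets
    n_arrays = len(datasets & MICROARRAYS)
    if has_core and n_arrays == 2:
        return "all four datasets"
    if has_core and n_arrays == 1:
        return "HPA + UniGene + one microarray"
    if datasets:
        return "at least one dataset"
    return "no transcriptomic data"


# Counts quoted in the text; their grouping is an interpretation.
ALL_FOUR, THREE_OF_FOUR, AT_LEAST_ONE = 10, 11, 433
processed = ALL_FOUR + THREE_OF_FOUR
remaining = AT_LEAST_ONE - processed
print(processed, remaining)
```

Under that reading, `processed` and `remaining` come out as 21 and 412, the figures given in the Conclusion.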

Conclusion

PepPSy has been developed as a user-friendly gene expression-based prioritization system, to help investigators determine in which human tissues they should look for an unseen protein, and to help curators quickly review the available transcriptomics data for a list of proteins. In this work, PepPSy has been applied to prioritize twenty-one proteins annotated as ‘Uncertain’ (PE5) in UniProtKB/Swiss-Prot and neXtProt for revision. As a result, transcriptomic evidence has been provided for 21 proteins, and biocurators have changed the status of eight of these based on all available information. PepPSy can now be used to choose the samples in which to look for the seven proteins that have been reclassified as PE2 or PE3, and to identify potentially problematic cases. In the near future, PepPSy will be used to revise the annotation of the 412 remaining PE5 entries with transcriptional information in at least one of the available transcriptomics datasets. PepPSy has thus proven to be an efficient companion tool for neXtProt biocuration and quality management workflows, and for the C-HPP project, which aims to obtain an accurate picture of the validation status of all human protein coding genes. In the near future we will extend the scope of PepPSy to stay abreast of rapid technological advances. We will gather other relevant datasets in PepPSy to cover other biological topics by including other tissues, specific cell types from single-cell RNA-seq data, and chemical-induced/disease-associated experimental samples. Finally, we are also planning to develop a community tool embedded in PepPSy that will stimulate the annotation of other missing proteins by facilitating collaborative work.

Acknowledgements

The authors thank Antoine D. Rolland, Ramona Britto, Emmanuelle Becker, Laëtitia Guillot and the GenOuest bioinformatics facility for stimulating discussions as well as for beta-testing this web server. They also thank Lionel Breuza and Sylvain Poux from the UniProt group at SIB for their input and their work in updating proteins in UniProtKB/Swiss-Prot. More generally, they thank all the curators from the UniProt and CCDS consortia for their dedication in providing up-to-date high-quality annotations for human genes and proteins, thus providing neXtProt with a solid foundation.

Funding

This work was supported by the ‘Agence nationale de sécurité sanitaire de l’alimentation, de l’environnement et du travail’ (ANSES) [grant number EST-13-081] and the ‘Fondation pour la recherche médicale’ (FRM) [grant number DBI20131228558] awarded to F.C. This project also benefited from European Union financial help (FEDER) [grant number 14MF434-01]. neXtProt development benefits from extensive funding support from the SIB Swiss Institute of Bioinformatics. The neXtProt server is hosted by VitalIT, the bioinformatics competence center that supports and collaborates with life scientists in Switzerland. Funding for open access charge: Institut national de la santé et de la recherche médicale (Inserm).

Conflict of interest. None declared.

References

1. Legrain, Aebersold, Archakov et al. (2011) The human proteome project: current state and future direction. Mol. Cell Proteomics, M111.009993.

2. Breuza, Poux, Estreicher et al. (2016) The UniProtKB guide to the human proteome. Database (Oxford), 2016.

3. Gaudet, Michel P.A., Zahn-Zabal et al. (2015) The neXtProt knowledgebase on human proteins: current status. Nucleic Acids Res, D764–D770.

4. Uhlen, Fagerberg, Hallstrom B.M. et al. (2015) Proteomics. Tissue-based map of the human proteome. Science, 347, 1260419.

5. NCBI Resource Coordinators (2016) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res, –D19.

6. Safran, Dalah, Alexander et al. (2010) GeneCards Version 3: the human gene integrator. Database (Oxford), 2010, baq020.

7. Omenn G.S., Lane, Lundberg E.K. et al. (2015) Metrics for the Human Proteome Project 2015: Progress on the Human Proteome and Guidelines for High-Confidence Protein Identification. J. Proteome Res, 3452–3460.

8. Farrell C.M., O'Leary N.A., Harte R.A. et al. (2014) Current status and new features of the Consensus Coding Sequence database. Nucleic Acids Res, D865–D872.

9. Pundir, Magrane, Martin M.J. et al. (2015) Searching and navigating UniProt databases. Curr. Protoc. Bioinf, 1 27 21–21 27 10.

10. Wang, Liu, Guo et al. (2014) CAPER 2.0: an interactive, configurable, and extensible workflow-based platform to analyze data sets from the Chromosome-centric Human Proteome Project. J. Proteome Res, –106.

11. Zhang, Lin et al. (2014) Discovery of novel genes and gene isoforms by integrating transcriptomic and proteomic profiling from mouse liver. J. Proteome Res, 2409–2419.

12. Chalmel, Rolland A.D. (2015) Linking transcriptomics and proteomics in spermatogenesis. Reproduction, 150, R149–R157.

13. Diez, Droste, Degano R.M. et al. (2015) Integration of Proteomics and Transcriptomics Data Sets for the Analysis of a Lymphoma B-Cell Line in the Context of the Chromosome-Centric Human Proteome Project. J. Proteome Res, 3530–3540.

14. Segura, Medina-Aunon J.A., Mora M.I. et al. (2014) Surfing transcriptomic landscapes. A step beyond the annotation of chromosome 16 proteome. J. Proteome Res, 158–172.

15. Bruford E.A., Lane, Harrow (2015) Devising a consensus framework for validation of novel human coding loci. J. Proteome Res, 14(12), 4945–.

16. Dong, Menon, Omenn G.S. et al. (2015) Structural bioinformatics inspection of neXtProt PE5 proteins in the human proteome. J. Proteome Res, 3750–3761.

17. Carapito, Lane, Benama et al. (2015) Computational and mass-spectrometry-based workflow for the discovery and validation of missing human proteins: application to chromosomes 2 and 14. J. Proteome Res, 3621–3634.

18. Farrah, Deutsch E.W., Hoopmann M.R. et al. (2013) The state of the human proteome in 2012 as viewed through PeptideAtlas. J. Proteome Res, 162–171.

19. Britto, Sallou, Collin et al. (2012) GPSy: a cross-species gene prioritization system for conserved biological processes–application in male gamete development. Nucleic Acids Res, W458–W465.

20. Kim M.S., Pinto S.M., Getnet et al. (2014) A draft map of the human proteome. Nature, 509, 575–581.

21. Huntley R.P., Sawford, Mutowo-Meullenet et al. (2014) The GOA database: gene ontology annotation updates for 2015. Nucleic Acids Res., 43(Database issue):D1057–63.