FungiProteomeDB: a database for the molecular weight and isoelectric points of the fungal proteomes

Table 1.

Summary of DB table

Table name	Purpose	Dependencies
Fungi_kingdom_summary	Summary of fungi kingdom (the whole DB)	No dependency
Species	Summary of individual species	No dependency
species_1	Proteomic details of species number 1 (Absidia glauca)	Table name depends on species table id
683 DB tables between the first and last species	Proteomic details of each specific species	Table name depends on species table id
species_685	Proteomic details of species number 685 (Zymoseptoria tritici)	Table name depends on species table id

Figure 1.

Home page of the FungiProteomeDB. It shows the basic information about the DB.

The construction of the ‘search protein sequence’ was also completed in two steps:

‘Preprocessing of the data’: For the development of this module, FASTA files were converted to CSV format using a Python script and then imported into the MySQL Server. The CSV files were then compressed for efficient memory use.
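The conversion step described above could look roughly like the following sketch. It is not the authors' script: the column names (accession, protein_name, sequence), the rule that the first token of the FASTA header is the accession, and the example records are all assumptions made for illustration.

```python
import csv
import io


def parse_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_csv(fasta_handle, csv_handle):
    """Write one CSV row per protein and return the number of rows written."""
    writer = csv.writer(csv_handle)
    writer.writerow(["accession", "protein_name", "sequence"])  # assumed columns
    count = 0
    for header, sequence in parse_fasta(fasta_handle):
        # Assumption: the first whitespace-delimited token is the accession.
        accession, _, name = header.partition(" ")
        writer.writerow([accession, name, sequence])
        count += 1
    return count


# Made-up records, used only to exercise the functions.
example = io.StringIO(
    ">XP_000001.1 hypothetical protein\nMKTAYIAK\nQRQISFVK\n"
    ">XP_000002.1 kinase\nMSDNE\n"
)
out = io.StringIO()
n_rows = fasta_to_csv(example, out)
```

Multi-line sequences are joined into a single field, so each protein occupies exactly one CSV row, which is the shape a bulk import into a relational table expects.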

In the proteome module (page), to ‘search proteins by molecular weight (MW) range’, the user has to select a species (otherwise, the first species is selected by default). After selecting a species, the user selects the start and end values of the MW from a range widget.

In the proteome module (page), to ‘search proteins by isoelectric point (pI) range’, the user has to select a species (otherwise, the first species is selected by default). After selecting a species, the user selects the start and end values of the pI from a range widget.
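A range search of this kind maps naturally onto a parameterized SQL query against the per-species table. The sketch below uses Python's built-in sqlite3 module as a stand-in for the MySQL/MariaDB server; the column names (accession, mw_kda, pi) and the toy rows are hypothetical, and only the species_<id> table naming comes from Table 1.

```python
import sqlite3


def species_table(species_id):
    """Build the per-species table name (species_<id>) described in Table 1."""
    return f"species_{int(species_id)}"  # int() rejects non-numeric input


def search_by_range(conn, species_id, column, low, high):
    """Return rows whose MW or pI lies within [low, high] for one species."""
    if column not in ("mw_kda", "pi"):  # hypothetical column names
        raise ValueError("column must be 'mw_kda' or 'pi'")
    sql = (
        f"SELECT accession, mw_kda, pi FROM {species_table(species_id)} "
        f"WHERE {column} BETWEEN ? AND ? ORDER BY {column}"
    )
    return conn.execute(sql, (low, high)).fetchall()


# In-memory stand-in database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE species_1 (accession TEXT, mw_kda REAL, pi REAL)")
conn.executemany(
    "INSERT INTO species_1 VALUES (?, ?, ?)",
    [("P1", 12.5, 4.8), ("P2", 50.0, 7.0), ("P3", 120.3, 9.6)],
)

by_mw = search_by_range(conn, 1, "mw_kda", 10, 60)  # MW range widget: 10 to 60
by_pi = search_by_range(conn, 1, "pi", 8, 14)       # pI range widget: 8 to 14
```

The range limits are passed as bound parameters, while the table and column names, which cannot be bound, are restricted to values generated or whitelisted by the code itself.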

Front end and backend languages and tools

Languages and tools used in FungiProteomeDB are as follows:

Backend languages and tools

The backend uses the MariaDB DB to store all the data. PHP was used as a server-side scripting language that responds to client-side requests and interacts with the DB. CodeIgniter 3.1.11, an open-source platform, was used to run the PHP scripts securely. Summary results were calculated by running SQL queries directly in phpMyAdmin.

Front end languages and tools

The JavaScript library jQuery and several of its extensions, i.e. jQuery UI, DataTables and Select2, were used to make the HTML content of the website attractive and dynamic. Scatter plots (2D proteome maps) were created using the JavaScript library canvasjs.js. CSS libraries from jQuery and Bootstrap were also used to style the web pages.

Results

In addition to the home page https://vision4research.com/fungidb/, four more web pages were developed (Figure 1).

Time efficient search interface by species and their attributes

For each of the 685 species, the total number of proteins is further divided into acidic, neutral and basic pI proteins. This information is fetched automatically from the DB table at page load, so the page always shows the latest information. Species can be quickly searched by name, or by any attribute available in the table, using the auto-focused search box displayed at the top of the page (Figure 2).

Figure 2.

Time efficient search interface by species and their attributes.

Species attributes sorting

All the following species attributes can be sorted in ascending or descending order by clicking the column heading:

Serial number (the default order)
Name of the species
Total number of protein sequences
Number of acidic pI proteins
Number of neutral pI proteins
Number of basic pI proteins.

Species records per page

By default, there are 10 rows per page. Users can also change this to 25, 50 or 100 records per page. The customized view presents the information in a structured format, so users can easily reuse it for further analysis in their research.

Species download in a variety of formats

Users can copy the species list to the clipboard or download it in CSV, Excel or PDF format. Moreover, users can print the retrieved results.

Species pagination

Users can navigate to the previous page, the next page or any specific page.

Virtual 2D proteome map

The map can be viewed by clicking the ‘View Map’ button next to a species name (Figure 2, seventh column). Figure 3 shows the 2D map of Sphaerobolus stellatus. The distribution is bi-modal, showing less variation in pI among the acidic proteins, and some values have a higher MW than the neutral or acidic proteins. A few instances of neutral proteins can be seen on the vertical line where pI = 7. The top left corner of the map also shows the total number of proteins (#dots) represented in the map, which is 35 181 in the case of S. stellatus.

Figure 3.

Virtual 2D map of fungal proteome. A representative 2D proteome map of Sphaerobolus stellatus is presented here.

Proteome search interface

The proteome page offers two search modes, (i) substring search and (ii) multi-select search, and each mode provides four types of search options for viewing proteins. Figure 4 shows the default view of the proteome page https://vision4research.com/fungidb/pages/proteomes when it loads. See Figures 5 and 6 for how to use the interface in substring search mode and Figures 7 and 8 for multi-select search mode.

Figure 4.

Proteome search interface in substring search mode. Users can search the proteome data using any one option or multiple options.

Figure 5.

User-friendly interface with autofill option to (A) select species using auto-filling options or from a drop-down list and (B) search specific proteome using accession numbers.

Figure 6.

Customized search results using the substring search mode of proteome search interface.

Figure 7.

Proteome search interface in multi-select search mode. Users can search the proteome data using any one option or multiple options.

Figure 8.

Customized search results using the multi-select search mode of proteome search interface.

‘Proteome/species search’: In both search modes, a single species is selected in this initial search. Figure 5A shows that a user can change the species using the auto-filling options or a drop-down list.
‘Proteome accession number search’: In substring search mode, all accession numbers of the selected species are selected by default, but the user can search for specific proteome accession numbers (Figure 5B). An autocomplete option is also provided, in which matching accession numbers are shown and can be selected. A user can also search for any substring of their choice. Internally, this autofill option is implemented with a wildcard search in the SQL statement, ‘%substring%’. In multi-select search mode, no accession number is selected by default, but the user can select multiple accession numbers from the multi-select menu.
‘Search by protein name’: In substring search mode, all protein names of the selected species are selected by default, but the user can also search for specific proteins by name. An autocomplete option is provided, in which the protein domain name is shown and can be selected. The user can also search for any substring; the substring is matched with a wildcard pattern such as ‘%substring%’. In multi-select search mode, no protein name is selected by default, but the user can select multiple protein names from the multi-select menu.
‘Search by sequence’: In substring search mode, all sequences of the selected species are selected by default, but the user can also search for specific protein sequences. An autocomplete option is provided, in which the protein sequence is shown and can be selected. The user can also search for any substring; the substring is matched with a wildcard pattern such as ‘%substring%’. In multi-select search mode, no sequence is selected by default, but the user can select multiple sequences from the multi-select menu.
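The two search modes correspond to two familiar SQL patterns: a LIKE match with a ‘%substring%’ pattern and an IN match over the selected values. The sketch below illustrates both with Python's sqlite3 module standing in for the production DB; the schema (accession, protein_name, sequence), the helper names and the toy rows are assumptions, not the site's actual code.

```python
import sqlite3

ALLOWED_COLUMNS = {"accession", "protein_name", "sequence"}  # hypothetical schema


def substring_search(conn, table, column, text=""):
    """Substring mode: an empty text gives '%%', which matches every row."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError("unknown column")
    # `table` is assumed to be a trusted species_<id> name built by the code.
    # Note: '%' or '_' inside `text` would act as wildcards (escaping omitted).
    sql = f"SELECT accession FROM {table} WHERE {column} LIKE ? ORDER BY accession"
    return [row[0] for row in conn.execute(sql, (f"%{text}%",))]


def multi_select_search(conn, table, column, values):
    """Multi-select mode: nothing is selected by default, so no rows return."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError("unknown column")
    if not values:
        return []
    placeholders = ", ".join("?" for _ in values)
    sql = (
        f"SELECT accession FROM {table} "
        f"WHERE {column} IN ({placeholders}) ORDER BY accession"
    )
    return [row[0] for row in conn.execute(sql, list(values))]


# In-memory stand-in database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE species_1 (accession TEXT, protein_name TEXT, sequence TEXT)")
conn.executemany(
    "INSERT INTO species_1 VALUES (?, ?, ?)",
    [
        ("A001", "chitin synthase", "MKTAYIAK"),
        ("A002", "heat shock protein 70", "MSDNEQRQ"),
        ("B001", "chitinase", "MKLLVAAT"),
    ],
)

by_name = substring_search(conn, "species_1", "protein_name", "chitin")
selected = multi_select_search(conn, "species_1", "accession", ["A002", "B001"])
```

The defaults of the two modes fall out of the query shapes: an empty substring matches all records of the species, whereas an empty selection matches none.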

Figure 6 also shows the retrieved results table (at the bottom) for the substring search mode, and Figure 8 shows it for the multi-select search mode, each with a search box for searching further within the retrieved results. Figures 6 and 8 also show other features of the proteome search interface:

‘Sub-search’: The fetched proteins can be searched further by any attribute shown in the fetched table using the search box. All seven columns in the retrieved table can be sorted in ascending or descending order by clicking the column heading.
‘Protein records per page’: As shown by the first button in Figures 6 and 8, there are 10 rows per page by default. Users can also change this to 25, 50 or 100, or show all records on a single page.
‘Download results in a variety of formats’: A user can copy the results to the clipboard or download them in CSV, Excel or PDF format. With the last button, ‘Print’, the user can print the fetched proteins.
‘Pagination of retrieved results’: A user can navigate to the previous page, the next page or any specific page.

Summary statistics

The module provides users with an overview of the overall statistics of the DB. The general statistics provided for the proteome of each species are:

Sequence count
Average MW and average pI (per proteome)
Average MW and average pI (per protein)
Number of acidic, basic and neutral pI proteins
Percentage of acidic, basic and neutral pI proteins.
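As a rough illustration of how such per-species statistics can be derived from a list of (MW, pI) values, consider the sketch below. The cut-off is an assumption: a protein is treated as neutral only when its pI is exactly 7, in line with the pI = 7 line described for the 2D map; the function names and the toy values are hypothetical.

```python
def classify_pi(pi, neutral=7.0):
    """Label a protein as acidic, neutral or basic (assumed cut-off: pI = 7)."""
    if pi < neutral:
        return "acidic"
    if pi > neutral:
        return "basic"
    return "neutral"


def summarize_proteome(proteins):
    """Summarize a proteome given as a list of (mw_kda, pi) tuples."""
    n = len(proteins)
    if n == 0:
        raise ValueError("proteome is empty")
    counts = {"acidic": 0, "neutral": 0, "basic": 0}
    for _, pi in proteins:
        counts[classify_pi(pi)] += 1
    return {
        "sequence_count": n,
        "average_mw_per_protein": sum(mw for mw, _ in proteins) / n,
        "average_pi_per_protein": sum(pi for _, pi in proteins) / n,
        "counts": counts,
        "percentages": {name: 100.0 * c / n for name, c in counts.items()},
    }


# Made-up (MW in kDa, pI) values for four proteins.
toy_proteome = [(12.5, 4.8), (50.0, 7.0), (120.3, 9.6), (30.2, 5.5)]
summary = summarize_proteome(toy_proteome)
```

Exact comparison with 7.0 only makes sense when the stored pI values are rounded to a fixed number of decimals; with unrounded values a small tolerance band around 7 would be the safer choice.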

The overall statistics of the FungiProteomeDB DB https://vision4research.com/fungidb/ are provided in Table 2.

Table 2.

Statistics of fungi DB

Number of species	685	Number of proteins	7 127 141
Proteomes vs. proteins summary		pI protein types summary
Maximum number of proteins	35 181	Number of acidic (pI) proteins	4 407 000
Minimum number of proteins	17	Number of neutral (pI) proteins	11 990
Average number of proteins	10 405	Number of basic (pI) proteins	2 708 151
Proteomes vs. pI summary		Proteomes vs. MW summary
Maximum pI (in all proteomes)	234 364	Maximum MW (all proteomes)	1 210 411 kDa
Minimum pI (in all proteomes)	122.1	Minimum MW (all proteomes)	610.27 kDa
Average pI (in each proteome)	69 437	Average MW (each proteome)	520 330 kDa
Maximum pI (in all proteins)	13.76	Maximum MW (all proteins)	2546.17 kDa
Minimum pI (in all proteins)	0	Minimum MW (all proteins)	0.003732 kDa
Average pI (in each protein)	6.67	Average MW (each protein)	50 kDa
Proteomes vs. sequence letters summary		Proteins vs. sequences letters summary
Maximum number of amino acids (all proteomes)	10 940 068	Maximum sequence length (all proteins)	23 089
Minimum number of amino acids (all proteomes)	5417	Minimum sequence length (all proteins)	1
Average number of amino acids (each proteome)	4 693 828	Average sequence length (each protein)	451
Total number of amino acids (all proteins)	3 215 271 966
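Several entries in Table 2 are related arithmetically, which allows a quick consistency check. The short script below recomputes the derived values from the totals reported in the table; the variable names are ours, and the script only re-derives figures already listed.

```python
# Totals copied from Table 2.
n_species = 685
n_proteins = 7_127_141
acidic, neutral, basic = 4_407_000, 11_990, 2_708_151
total_residues = 3_215_271_966

# The three pI classes should add up to the total number of proteins.
class_total = acidic + neutral + basic

# Derived averages, to compare with the corresponding rows of Table 2.
avg_proteins_per_species = n_proteins / n_species
avg_residues_per_proteome = total_residues / n_species
avg_sequence_length = total_residues / n_proteins

# Share of each pI class among all proteins, in percent.
percent = {
    name: round(100 * count / n_proteins, 2)
    for name, count in (("acidic", acidic), ("neutral", neutral), ("basic", basic))
}

print(class_total == n_proteins)
print(round(avg_proteins_per_species), round(avg_residues_per_proteome),
      round(avg_sequence_length))
print(percent)
```

The class counts sum exactly to 7 127 141, and the rounded averages reproduce the tabulated 10 405 proteins per species, 4 693 828 amino acids per proteome and 451 residues per protein. Acidic proteins make up roughly 61.8% of the DB, basic proteins about 38.0% and neutral proteins about 0.2%.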

Discussion

Several DBs provide important information about the genomic and proteomic aspects of different organisms. ProteomicsDB (https://www.proteomicsdb.org/), reported by Wilhelm et al., described the mass spectrometry (MS)-based draft of the human proteome (58). By comparing messenger ribonucleic acid and protein expression profiles, they reported conserved control of protein abundance (58). Furthermore, their analysis with integrated drug-sensitivity data enabled them to identify proteins associated with resistance or susceptibility to a particular drug (58). The proteome profile can also help in understanding the stoichiometry and composition of protein complexes (58). ProteomeXchange was developed to provide globally coordinated, standardized data submission and dissemination for comparative analysis and for extracting novel findings from published data (59). The PRIDE (PRoteomics IDEntifications) DB (http://www.ebi.ac.uk/pride) makes MS data publicly accessible for comparative and functional proteomics. PeptideAtlas (http://www.peptideatlas.org/#) provides access to the compendium of peptides identified in MS experiments (60). It takes mass spectrometer output files from various organisms and searches them using the latest search engines and protein sequences (60). PeptideAtlas uses MS data of small peptides and maps them to the genome of the eukaryotic organism (60). A considerable analytical process with constant statistical validation leads to the identification of peptides and proteins (60). The Arabidopsis PeptideAtlas was developed to harness worldwide proteomic data to create a comprehensive community proteomics resource (61). It provides proteomic information on post-translational modifications and splice forms of specific proteins (61). The Arabidopsis PeptideAtlas identified 17 858 unique proteins at the highest confidence level (61).

The plant proteome DB (http://ppdb.tc.cornell.edu/) reports experimental proteome and MS analysis data, together with curated information on protein function, subcellular localization and protein properties (63). PlantMWpIDB reported the proteomic details of plant proteomes using the MW and pI of proteins (62). The fungal secretome DB (https://fsd.snu.ac.kr/) reported the secretory proteins of 158 fungal species, comprising 208 883 proteins (64). These comprise 15.21% of the total proteome. Although these fungi-related DBs were constructed to elucidate proteomic details, they were mainly based on experimental MS data. Therefore, it is challenging to elucidate the proteomic information of a large number of proteins. We used MW and pI data to overcome this issue and construct the DB. This makes it possible to find proteins with acidic and basic pI. Basic pI proteins usually reside in cellular compartments with a basic pH range (65).

Proteome-pI and Proteome-pI 2.0 reported the MW and pI of 20 115 proteomes (66, 67). Kozlowski et al. reported the pIs of different proteomes using 21 algorithms (66). They studied the proteomes of viruses, archaea, bacteria and eukaryotes. However, they did not differentiate the kingdoms within the eukaryotic lineages. As a result, identifying the MW and pI of a specific protein from a specific species is confusing, and the lack of a suitable classification into plant, animal and fungal lineages makes the resource difficult to use effectively. Furthermore, Proteome-pI 2.0 does not have a specific option to search the MW and pI of a specific protein in a proteome. A more critical aspect of Proteome-pI 2.0 is its use of 21 different parameters: IPC2.peptide.svr19, IPC2.protein.svr19, Wikipedia, Toseland, Thurlkill, Solomon, Sillero, Rodwell, ProMoST, Patrickios, Nozaki, Lehninger, IPC_protein, IPC_peptide, IPC2_protein, Bjellqvist, DTASelect, Dawson, EMBOSS, Grimsley and IPC2_peptide (66). These 21 parameters result in 21 different MWs and pIs for a single protein/peptide. When there are 21 different variations for a single sample, it becomes confusing to decide which output to accept. Using one particular algorithm to calculate the MW and pI is more practical than using all 21. Therefore, we used only one algorithm in our study, i.e. IPC_protein, and constructed the DB FungiProteomeDB.
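To make the dependence on the chosen parameter set concrete, the sketch below shows the general way pK-based calculators estimate a pI: compute the net charge of the sequence with the Henderson-Hasselbalch equation and bisect on pH until the charge is zero. This is not the IPC_protein implementation used for the DB. The pK values are illustrative placeholders, not the IPC_protein set, and the function names are our own.

```python
# Illustrative pK values only; NOT the IPC_protein parameter set.
POSITIVE_PK = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PK = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PK, C_TERM_PK = 8.6, 3.6


def net_charge(sequence, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PK))   # N-terminal amine
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PK - ph))  # C-terminal carboxyl
    for residue in sequence.upper():
        if residue in POSITIVE_PK:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PK[residue]))
        elif residue in NEGATIVE_PK:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PK[residue] - ph))
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """Bisect on pH in [0, 14] until the interval is narrower than tolerance."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: the pI lies at a higher pH
        else:
            high = mid
    return (low + high) / 2.0


pi_neutral_chain = isoelectric_point("GGG")   # only the termini are ionizable
pi_basic_chain = isoelectric_point("KKKK")    # lysine-rich, expected basic
pi_acidic_chain = isoelectric_point("DDDD")   # aspartate-rich, expected acidic
```

Swapping in a different pK table changes the charge curve and therefore the pH at which it crosses zero, which is why different parameter sets report different pIs for the same sequence.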

Conclusion and future work

The proposed FungiProteomeDB allows researchers to retrieve information on the MW and pI of proteins within the proteomes of 685 fungal species. FungiProteomeDB is a comprehensive DB for fungal proteomes and contains several modules for searching, retrieving and saving data. Future versions of FungiProteomeDB will make the DB more powerful for obtaining information on the proteome of the entire fungi kingdom. They will also include a protein molecular modeling module to decipher the 3D structure of each protein, as well as target site prediction for metacaspases, palmitoylation, myristoylation and methylation for each protein. This additional information will be valuable to researchers investigating protein modification, function, structure and evolution. Currently, in the protein search, only one species at a time can be searched by different attributes. In a future version, any number of species will be searchable at a time by any number of attributes. Moreover, we want to add an option for registered users with admin privileges to upload new MWs and pIs of different species or proteins, or their annotations. It will be part of a DB that automatically accepts the submission of proteomic data and all the related information. We would also like to search and summarize unique biomarkers in the fungi kingdom [a patch of an amino acid subsequence of length n (n ≥ 2–5), which is present in the whole proteome file]. Currently, species can be sorted by the count values of proteins or pI classes; we would also like to add sorting by the sum values of pI or MW.

Data availability

All the data used in this manuscript were taken from the publicly available “National Center for Biotechnology Information” (NCBI) database, and all the data can be found in our database.

Author contribution

T.K.M. conceived the idea, collected and calculated the MW and pI of proteins, analyzed and interpreted the data and drafted and revised the manuscript. M.R. and M.O. analyzed and interpreted the data, drafted and revised the manuscript, designed, constructed and tested the DB.

Conflict of interest

None declared.

References

1. Chopra, Mishra A.K., Baig A.M. et al. (2021) Bioactive potential of various mushrooms as the treasure of versatile therapeutic natural product. J. Fungi, 728.
2. Mustafa, Chopra, Baig A.A. et al. (2022) Edible mushrooms as novel myco-therapeutics: effects on lipid level, obesity and BMI. J. Fungi, 211.
3. Mohanta, Nayak, Biswas et al. (2018) Silver nanoparticles synthesized using wild mushroom show potential antimicrobial activities against food borne pathogens. Molecules, 655.
4. Mohanta T.K. and Bae (2015) The diversity of fungal genome. Biol. Proced. Online, 8.
5. Rana K.L., Kour, Sheikh et al. (2019) Biodiversity of Endophytic Fungi from Diverse Niches and Their Biotechnological Applications. In: Singh (ed) Advances in Endophytic Fungal Research: Present Status and Future Challenges. Springer International Publishing, Cham, pp. 105–144.
6. Raghukumar, Damare S.R. and Singh (2010) A review on deep-sea fungi: occurrence, diversity and adaptations. Botanica Marina, 479–492.
7. Bergero, Girlanda, Varese G.C. et al. (1999) Psychrooligotrophic fungi from Arctic soils of Franz Joseph Land. Polar Biol., 361–368.
8. Robinson C.H. (2001) Cold adaptation in Arctic and Antarctic fungi. New Phytol., 151, 341–353.
9. Sieverding (1990) Ecology of VAM fungi in tropical agrosystems. Agric. Ecosyst. Environ., 369–390.
10. Mohanta, Singdevsachan, Parida et al. (2016) Green synthesis and antimicrobial activity of silver nanoparticles using wild medicinal mushroom Ganoderma applanatum (Pers.) Pat. from Similipal Biosphere Reserve, Odisha, India. IET Nanobiotechnol., 184–189.
11. Hankin and Anagnostakis S.L. (1975) The use of solid media for detection of enzyme production by fungi. Mycologia, 597–607.
12. Gurr, Samalova and Fisher (2011) The rise and rise of emerging infectious fungi challenges food security and ecosystem health. Fungal Biol. Rev., 181–188.
13. Borman A.M., Linton C.J., Miles S.-J. et al. (2008) Molecular identification of pathogenic fungi. J. Antimicrob. Chemother., –i12.
14. Mohanta T.K., Hashem, Abd_Allah E.F. et al. (2021) Fungal genomes: suffering with functional annotation errors. IMA Fungus, 32.
15. Grigoriev I.V., Cullen, Goodwin S.B. et al. (2011) Fueling the future with fungal genomics. Mycology, 192–209.
16. Haridas, Salamov, Grigoriev I.V. (2018) Fungal Genome Annotation. In: de Vries, Tsang, Grigoriev (eds) Fungal Genomics: Methods and Protocols. Springer New York, New York, pp. 171–184.
17. Ehleringer J.R. and Monson R.K. (1993) Evolutionary and ecological aspects of photosynthetic pathway variation. Annu. Rev. Ecol. Syst., 411–439.
18. Delahunty and Yates J.R. III (2005) Protein identification using 2D-LC-MS/MS. Methods, 248–255.
19. Vandenbogaert, Li-Thiao-Té, Kaltenbach H.-M. et al. (2008) Alignment of LC-MS images, with applications to biomarker discovery and protein identification. Proteomics, 650–672.
20. Nørregaard Jensen (2004) Modification-specific proteomics: characterization of post-translational modifications by mass spectrometry. Curr. Opin. Chem. Biol.
21. Mann and Jensen O.N. (2003) Proteomic analysis of post-translational modifications. Nat. Biotechnol., 255–261.
22. Mohanta T.K., Bashir, Hashem et al. (2017) Systems biology approach in plant abiotic stresses. Plant Physiol. Biochem., 121–.
23. Weston A.D. and Hood (2004) Systems biology, proteomics, and the future of health care: toward predictive, preventative, and personalized medicine. J. Proteome Res., 179–196.
24. Ebhardt H.A., Root, Sander et al. (2015) Applications of targeted proteomics in systems biology and translational medicine. Proteomics, 3193–3208.
25. Lakshman D.K., Natarajan S.S., Lakshman et al. (2008) Optimized protein extraction methods for proteomic analysis of Rhizoctonia solani. Mycologia, 100, 867–875.
26. Bouws, Wattenberg and Zorn (2008) Fungal secretomes—nature’s toolbox for white biotechnology. Appl. Microbiol. Biotechnol., 381.
27. Kim, Nandakumar M.P. and Marten M.R. (2008) The state of proteome profiling in the fungal genus Aspergillus. Brief. Funct. Genom.
28. Carberry, Neville C.M., Kavanagh K.A. et al. (2006) Analysis of major intracellular proteins of Aspergillus fumigatus by MALDI mass spectrometry: identification and characterisation of an elongation factor 1B protein with glutathione transferase activity. Biochem. Biophys. Res. Commun., 341, 1096–1104.
29. Braaksma, Martens-Uzunova E.S., Punt P.J. et al. (2010) An inventory of the Aspergillus niger secretome by combining in silico predictions with shotgun proteomics data. BMC Genom., 584.
30. Fernández-Acero F.J., Colby, Harzen et al. (2010) 2-DE proteomic approach to the Botrytis cinerea secretome induced with different carbon sources and plant-based elicitors. Proteomics, 2270–2280.
31. Cagas S.E., Raja J.M., Hong et al. (2011) Profiling the Aspergillus fumigatus proteome in response to caspofungin. Antimicrob. Agents Chemother., 146–154.
32. Ijaq, Malik, Kumar et al. (2019) A model to predict the function of hypothetical proteins through a nine-point classification scoring schema. BMC Bioinform., 14.
33. Mohanta T.K., Khan A.L., Hashem et al. (2019) The molecular mass and isoelectric point of plant proteomes. BMC Genom., 631.
34. Mohanta T.K., Mishra A.K., Khan et al. (2021) Virtual 2-D map of the fungal proteome. Sci. Rep., 6676.
35. Wolf, Lucas W.J., Deom C.M. et al. (1989) Movement protein of tobacco mosaic virus modifies plasmodesmatal size exclusion limit. Science, 246, 377–379.
36. Ivankov D.N., Garbuzynskiy S.O., Alm et al. (2003) Contact order revisited: influence of protein size on the folding rate. Protein Sci., 2057–2062.
37. Hishigaki, Nakai, Ono et al. (2001) Assessment of prediction accuracy of protein function from protein–protein interaction data. Yeast, 523–531.
38. Kudlow J.E. (2006) Post-translational modification by O-GlcNAc: another way to change protein function. J. Cell. Biochem., 1062–1075.
39. Belizaire and Unanue E.R. (2009) Targeting proteins to distinct subcellular compartments reveals unique requirements for MHC class I and II presentation. Proc. Natl. Acad. Sci., 106, 17463–17468.
40. Park, Choi S.S. and K.-S. (2010) Transglutaminase 2: a multi-functional protein in multiple subcellular compartments. Amino Acids, 619–631.
41. Ugo, Marafini and Meneghello (2021) From biomolecular recognition to nanobiosensing. Bioanal. Chem.
42. Erickson H.P. (2019) Kinetics of protein–protein association and dissociation. Princ. Protein–Protein Assoc., 2019–.
43. Y.C., Koch W.F., Berezansky P.A. et al. (1992) The dissociation constant of amino acids by the conductimetric method: I. pK1 of MOPSO-HCl at 25°C. J. Solution Chem., 597–605.
44. Das R.K., Crick S.L. and Pappu R.V. (2012) N-terminal segments modulate the α-helical propensities of the intrinsically disordered basic regions of bZIP proteins. J. Mol. Biol., 416, 287–299.
45. Vamvaca, Volles M.J. and Lansbury P.T. (2009) The first N-terminal amino acids of α-synuclein are essential for α-helical structure formation in vitro and membrane binding in yeast. J. Mol. Biol., 389, 413–424.
46. Requião R.D., Fernandes, de Souza H.J.A. et al. (2017) Protein charge distribution in proteomes and its impact on translation. PLoS Comput. Biol., e1005549.
47. von Heijne (1986) Net N-C charge imbalance may be important for signal sequence function in bacteria. J. Mol. Biol., 192, 287–290.
48. von Heijne (1984) Analysis of the distribution of charged residues in the N-terminal region of signal sequences: implications for protein export in prokaryotic and eukaryotic cells. EMBO J., 2315–2318.
49. F.-M.L. and Q.-Z. (2008) Predicting protein subcellular location using Chou’s pseudo amino acid composition and improved hybrid approach. Protein Pept. Lett., 612–616.
50. Park K.-J. and Kanehisa (2003) Prediction of protein subcellular locations by support vector machines using compositions of amino acids and amino acid pairs. Bioinformatics, 1656–1663.
51. Pierleoni, Martelli P.L., Fariselli et al. (2007) eSLDB: eukaryotic subcellular localization database. Nucleic Acids Res., D208–D212.
52. Rastogi and Rost (2011) LocDB: experimental annotations of localization for Homo sapiens and Arabidopsis thaliana. Nucleic Acids Res., D230–D234.
53. Negi, Pandey, Srinivasan S.M. et al. (2015) LocSigDB: a database of protein localization signals. Database, 2015–.
54. Guo, Liu et al. (2016) Human protein subcellular localization with integrated source and multi-label ensemble classifier. Sci. Rep., 28087.
55. Orre L.M., Vesterlund, Pan et al. (2019) SubCellBarCode: proteome-wide mapping of protein localization and relocalization. Mol. Cell, 166–182.e7.
56. Wan, Mak M.-W. and Kung S.-Y. (2012) mGOASVM: multi-label protein subcellular localization based on gene ontology and support vector machines. BMC Bioinform., 290.
57. Kozlowski L.P. (2016) IPC – Isoelectric Point Calculator. Biol. Direct, 55.
58. Wilhelm, Schlegl, Hahne et al. (2014) Mass-spectrometry-based draft of the human proteome. Nature, 509, 582–587.
59. Vizcaíno J.A., Deutsch E.W., Wang et al. (2014) ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Nat. Biotechnol., 223–226.
60. Desiere, Deutsch E.W., King N.L. et al. (2006) The PeptideAtlas project. Nucleic Acids Res., D655–D658.
61. van Wijk K.J., Leppert, Sun et al. (2021) The Arabidopsis PeptideAtlas: harnessing worldwide proteomics data to create a comprehensive community proteomics resource. Plant Cell, 3421–3453.
62. Mohanta T.K., Kamran M.S., Omar et al. (2022) PlantMWpIDB: a database for the molecular weight and isoelectric points of the plant proteomes. Sci. Rep.
63. Sun, Zybailov, Majeran et al. (2009) PPDB, the plant proteomics database at Cornell. Nucleic Acids Res., D969–D974.
64. Choi, Park, Kim et al. (2010) Fungal secretome database: integrated platform for annotation of fungal secretomes. BMC Genom., 105.
65. Ferro, Salvi, Rivière-Rolland et al. (2002) Integral membrane proteins of the chloroplast envelope: identification and subcellular localization of new transporters. Proc. Natl. Acad. Sci., 11487–11492.