Abstract

Fungi produce a wide range of extracellular enzymes to break down plant cell walls, which are composed mainly of cellulose, lignin and hemicellulose. Among them are the glycoside hydrolases (GH), the largest and most diverse family of enzymes active on these substrates. To facilitate research and development of enzymes for the conversion of cell-wall polysaccharides into fermentable sugars, we have manually curated a comprehensive set of characterized fungal glycoside hydrolases. Characterized glycoside hydrolases were retrieved from protein and enzyme databases, as well as literature repositories. A total of 453 characterized glycoside hydrolases have been cataloged. They come from 131 different fungal species, most of which belong to the phylum Ascomycota. These enzymes represent 46 different GH activities and cover 44 of the 115 CAZy GH families. In addition to enzyme source and enzyme family, available biochemical properties such as temperature and pH optima, specific activity, kinetic parameters and substrate specificities were recorded. To simplify comparative studies, enzyme and species abbreviations have been standardized, Gene Ontology terms assigned and reference to supporting evidence provided. The annotated genes have been organized in a searchable, online database called mycoCLAP (Characterized Lignocellulose-Active Proteins of fungal origin). It is anticipated that this manually curated collection of biochemically characterized fungal proteins will be used to enhance functional annotation of novel GH genes.

Database URL: http://mycoCLAP.fungalgenomics.ca/

Introduction

Plant cell walls are composed mainly of cellulose, lignin and hemicellulose. This composite is often referred to as lignocellulose, and is the most abundant renewable resource with the potential to replace fossil fuels in the production of a wide spectrum of fuels, chemicals and materials. One of the key challenges facing the widespread use of lignocellulose for fuel and chemical production is finding economically and environmentally sustainable solutions for the conversion of lignocellulose into sugar building blocks. The fungal kingdom encompasses tremendous genetic diversity, and by virtue of secreted enzymes many of its members are potent decomposers of plant cell walls. Glycoside hydrolases (GH) are the most diverse group of enzymes used by microbes in the degradation of biomass. Over a hundred GH families have been classified to date (1–5). Many of them are responsible for the hydrolysis of the carbon–oxygen–carbon bonds that link the sugar residues in cellulose and hemicelluloses (6,7). Although aided by other enzymes, it is the glycoside hydrolases that degrade the main chains of these polysaccharides, thus potentially having the greatest impact on the conversion of lignocellulose. The discovery of efficient glycoside hydrolases and the development of optimal combinations of these enzymes are two important approaches to reducing the cost of bioconversion.

To support the discovery of novel biomass-degrading enzymes, an increasing number of genomes of lignocellulolytic fungi are being sequenced (8–19). This has resulted in numerous sequences, most of which are annotated electronically or not annotated at all. Current databases do not distinguish biochemically characterized data from electronically annotated data. Running a query sequence against one of these databases results in a long list of hits ranked by percentage identity and coverage. Results must be sorted and individually evaluated to distinguish the hits that were annotated electronically from those whose function has been determined experimentally. To make the annotation process accurate and efficient, it is important to be able to easily link sequence information with the biochemically characterized properties of closely related sequences.

In this study, we have curated and annotated a comprehensive set of fungal genes encoding characterized GH family enzymes. This data set forms the basis of a searchable database of genes and their gene products, along with experimentally characterized biochemical properties, which is meant to be an ongoing, collaborative tool for fungal genome annotation and enzyme discovery.

Methods

Defining characterized glycoside hydrolases

For the purpose of this study, the term ‘characterized glycoside hydrolase’ refers to a protein that has satisfied the following criteria: (i) the gene sequence has been deposited in a public repository; (ii) the gene product has been assayed for a specific GH activity; and (iii) biochemical properties of the gene product have been reported in a peer-reviewed journal.
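
The three criteria can be read as a simple conjunction. The following is a minimal sketch of that check, not the curation pipeline itself: the `CandidateEnzyme` record, its field names and the placeholder values are hypothetical and serve only to illustrate the definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateEnzyme:
    """Hypothetical record for one candidate glycoside hydrolase."""
    sequence_accession: Optional[str]       # criterion (i): sequence in a public repository
    assayed_activity: Optional[str]         # criterion (ii): specific GH activity assayed
    peer_reviewed_reference: Optional[str]  # criterion (iii): properties published


def is_characterized(candidate: CandidateEnzyme) -> bool:
    """Return True only if all three criteria are satisfied."""
    return all([
        candidate.sequence_accession,
        candidate.assayed_activity,
        candidate.peer_reviewed_reference,
    ])


# Placeholder values, not real identifiers.
complete = CandidateEnzyme("ACCESSION_PLACEHOLDER", "endo-1,4-β-xylanase", "REFERENCE_PLACEHOLDER")
no_sequence = CandidateEnzyme(None, "endo-1,4-β-xylanase", "REFERENCE_PLACEHOLDER")
print(is_characterized(complete), is_characterized(no_sequence))
```

An enzyme with published biochemical data but no deposited sequence fails criterion (i) and is therefore excluded.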

Literature survey

The EC Explorer on BRENDA [The Comprehensive Enzyme Information System (20)], http://www.brenda-enzymes.org/, was used as a guide for the different types of GH activities. The EC number 3.2.1, representing GH family enzymes, was selected on the Explorer. Under each 3.2.1.X, the table with the first column entitled ‘Organism’ was used as the starting point for collecting literature. Only literature associated with organisms of fungal origin was investigated further.

BRENDA provides either a direct link to the article on PubMed or cites the original publication. In the latter case, the Google Scholar ‘Advanced Search’ <http://scholar.google.com/advanced_scholar_search> was used to obtain the article of interest from another online resource. If an article was unobtainable through either PubMed or Google Scholar, a hard copy was ordered through an interlibrary loan system using the citation provided by BRENDA.

Once BRENDA was exhausted as a resource, PubMed was used. ‘MyNCBI’ was used to filter searches, keep track of the results and email to the curator any new additions that met the saved search criteria. Keyword searches were used to find articles of interest on PubMed. Each GH activity type listed on BRENDA was used as a keyword. Filters and limits were used to narrow the search results down to characterized enzymes of fungal origin.

Finding the sequence associated with the literature

If the sequences were available on GenBank (21), PubMed provided links to the gene and protein pages associated with the article. For articles from other sources and those PubMed articles without links to GenBank, the full text was searched for a sequence accession number and the associated database. For example, articles about glycoside hydrolases from the fungus Rhizopus oryzae usually refer to gene or protein identifiers from the Fungal Genome Initiative at the Broad Institute.

In some articles, the whole amino acid sequence was published but without an accession number. In these cases, the sequence was entered into BLASTp to search for a sequence ID in the appropriate database. UniProt (22) and GenBank were used as the default databases to search unless the species was known to have been sequenced by one of the major sequencing centers. A hit from the same organism having 100% identity and coverage with the query was considered a match.
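
The matching rule described above (same organism, 100% identity, 100% coverage) can be sketched as a filter over parsed BLASTp hits. This is an illustrative sketch under stated assumptions: the `BlastHit` record and its field names are hypothetical, and the hits are assumed to have already been parsed from the BLASTp output.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class BlastHit:
    """Hypothetical, already-parsed BLASTp hit."""
    subject_id: str
    organism: str
    percent_identity: float  # 0-100
    query_coverage: float    # 0-100


def find_match(hits: Iterable[BlastHit], source_organism: str) -> Optional[BlastHit]:
    """Return the first hit from the source organism with 100% identity and coverage."""
    wanted = source_organism.strip().lower()
    for hit in hits:
        same_organism = hit.organism.strip().lower() == wanted
        if same_organism and hit.percent_identity == 100.0 and hit.query_coverage == 100.0:
            return hit
    return None  # no sequence ID could be assigned from this search


# Placeholder hits, not real database records.
hits = [
    BlastHit("ID_PLACEHOLDER_1", "Aspergillus niger", 98.5, 100.0),
    BlastHit("ID_PLACEHOLDER_2", "Aspergillus niger", 100.0, 100.0),
]
match = find_match(hits, "Aspergillus niger")
print(match.subject_id if match else "no match")
```

A near-identical hit (for example 98.5% identity) is deliberately rejected, because it may correspond to a paralog or to a different strain rather than to the published sequence.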

On occasion, sequences were found by keyword search. Searching GenBank or UniProt with the enzyme activity and the name of the organism returned a list of hits. If a hit cited a published article of interest, the match was considered successful.

Cataloging characterized glycoside hydrolases of fungal origin

Data from published articles meeting our criteria of characterized glycoside hydrolases were organized in a spreadsheet format. The genes encoding the glycoside hydrolases were assigned unique identifiers. Each gene occupies one row of the spreadsheet, and the data associated with that gene are recorded in the columns. Table 1 lists the titles of the columns included in the spreadsheet and the types of information described under each column. For the ‘Literature’ column, PubMed identification numbers (PMIDs) identified articles that described the characterization, while literature that was not available on PubMed was identified by its DOI number or similar reference ID.

Table 1.

The types of information extracted from the characterization literature

| Column Number | Title | Type of information |
| --- | --- | --- |
| 1 | Entry name | The unique identifier representing the enzyme. It incorporates the enzyme activity, the GH family it belongs to, and the phylogenetic origin of the enzyme. |
| 2 | Gene name | The assigned gene name based on the standardized naming convention adopted for this study (see ‘Methods’ section). |
| 3 | Species | The genus and species of the enzyme’s natural host. |
| 4 | Strain | The strain of fungus used to obtain the gene and/or enzyme. |
| 5 | Gene name | The assigned gene name based on the standardized naming convention adopted for this study (see ‘Methods’ section). |
| 6 | Gene alias | Any other names by which the gene is referred to in the literature or sequence databases. |
| 7 | Enzyme name | The name most commonly used to identify an enzyme of a specific activity type. |
| 8 | Enzyme alias | Any other names by which the gene product (enzyme) is referred to in literature or public databases. |
| 9 | Systematic name | The systematic enzyme name according to the EC. <http://www.brenda-enzymes.org/> |
| 10 | EC number | A numerical classification of enzymes based on the reactions they catalyze. <http://www.chem.qmul.ac.uk/iubmb/> |
| 11 | Gene ID (GenBank) | The nucleotide sequence ID issued by the GenBank database. <http://www.ncbi.nlm.nih.gov/> |
| 12 | UniProt ID | The ID issued to each protein in the UniProt database. <http://www.uniprot.org/> |
| 13 | Protein ID (GenBank) | The protein ID issued by the GenBank database. <http://www.ncbi.nlm.nih.gov/> |
| 14 | Characterization literature | The ID of the literature describing the enzyme’s characterization and properties, recorded as PMID (PubMed ID) or CSFGID (Centre for Structural and Functional Genomics). |
| 15 | Structure literature | The PMID or CSFGID of any literature describing the structure of the enzyme, if available. |
| 16 | GH family | The GH family the enzyme belongs to. <http://www.cazy.org/> |
| 17 | Assay | The experiment used to determine the function and/or properties of the enzyme. |
| 18 | Activity assay conditions | The buffer, pH and temperature used in the assay to determine the enzyme activity. |
| 19 | Kinetic assay conditions | The buffer, pH and temperature in which the Km, kcat and/or Vmax were determined. |
| 20 | Substrates | The chemical substrates that the enzyme was assayed on. |
| 21 | Host (recombinant expression) | The organism used to produce the recombinant enzyme for the experimental assay. |
| 22 | Specific activity | The activity of the purified enzyme on the given substrate. Recorded in U/mg, where 1 U (unit) = 1 μmol/min = 16.67 nkat. |
| 23 | Substrate specificity | The activity of the enzyme on a given substrate compared to other substrates tested. Expressed as a percentage with the highest activity usually equal to 100%. |
| 24 | Km | The Michaelis–Menten constant (Km) reflects the concentration of substrate at which initial velocity is one-half Vmax. Recorded in millimolar (mM) or milligrams/milliliter (mg/ml). |
| 25 | kcat | The maximum number of reactions the enzyme catalyzes in one second (s−1). |
| 26 | Vmax | The maximum velocity at which an enzyme catalyzes a reaction. Reported in different ways, often as U/mg. |
| 27 | pH optimum | The pH at which enzyme activity is maximal. |
| 28 | pH stability | The pH range over which the enzyme retains most of its activity (usually ≥80%) under the conditions defined in the paper. |
| 29 | Temperature optimum | The temperature (°C) at which enzyme activity is maximal. |
| 30 | Temperature stability | The temperature (°C) beyond which a portion of the enzyme activity (usually ≥20%) is lost under the conditions defined in the study. |
| 31 | Isoelectric point (theoretical) | The pI of the enzyme calculated from its amino acid composition. |
| 32 | Isoelectric point (experimental) | The pI of the enzyme determined by isoelectric focusing. |
| 33 | Molecular weight (theoretical) | The molecular weight (kDa) of the enzyme calculated from its amino acid composition. |
| 34 | Molecular weight (experimental) | The molecular weight (kDa) of the enzyme estimated using SDS–PAGE, gel filtration, etc. |
| 35 | Protein length | The number of amino acids in the enzyme before cleavage of the signal peptide (unless stated otherwise). |
| 36 | Signal peptide | The number of amino acids comprising the signal peptide, which targets the enzyme for secretion. |
| 37 | CBD | Carbohydrate binding domain, if present as part of the enzyme. |
| 38 | Glycosylation | Type of glycosylation (only if experimentally determined). |
| 39 | Other features | Any other information regarding the enzyme’s activity. |
| 40 | GO (molecular) | The GO term defining the molecular function of the enzyme. |
| 41 | Evidence (molecular) | The type of information supporting the annotation of the molecular function of the enzyme. |
| 42 | GO (process) | The GO term defining the biological process the enzyme participates in. |
| 43 | Evidence (process) | The type of information supporting the annotation of the biological process. The biological process is only assigned the evidence code ‘Inferred from Direct Assay (IDA)’ when the enzyme was assayed on its natural substrate. |
| 44 | GO (component) | The GO term defining the cellular compartment in which the enzyme acts. |
| 45 | Evidence (component) | The type of information supporting the enzyme’s component annotation. |

This table lists the names of the columns used to organize the collected data in a spreadsheet. The type of data each heading encompasses is explained on the right.
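
Several of the recorded quantities are related by simple arithmetic. The sketch below illustrates three of them under stated assumptions: 1 U is taken as 1 μmol/min (about 16.67 nkat), substrate specificity is expressed relative to the best substrate, and initial velocity follows the Michaelis–Menten equation v = Vmax·[S]/(Km + [S]). The function names and example numbers are hypothetical and are not taken from the database.

```python
from typing import Dict

# 1 U = 1 μmol/min = 1000 nmol / 60 s, i.e. about 16.67 nkat.
NKAT_PER_UNIT = 1000.0 / 60.0


def u_per_mg_to_nkat_per_mg(specific_activity_u_per_mg: float) -> float:
    """Convert a specific activity from U/mg to nkat/mg."""
    return specific_activity_u_per_mg * NKAT_PER_UNIT


def relative_activities(activities: Dict[str, float]) -> Dict[str, float]:
    """Express each activity as a percentage of the highest one (set to 100%)."""
    top = max(activities.values())
    if top <= 0:
        raise ValueError("at least one activity must be positive")
    return {substrate: 100.0 * value / top for substrate, value in activities.items()}


def michaelis_menten_rate(vmax: float, km: float, substrate_conc: float) -> float:
    """Initial velocity v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate_conc / (km + substrate_conc)


# Illustrative numbers only.
print(round(u_per_mg_to_nkat_per_mg(1.0), 2))
print(relative_activities({"substrate A": 50.0, "substrate B": 25.0}))
print(michaelis_menten_rate(vmax=10.0, km=2.0, substrate_conc=2.0))
```

The last call uses a substrate concentration equal to Km, which gives half of Vmax, in line with the definition of Km in Table 1.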

Assignment of standardized features

The Enzyme Commission (EC) and the Gene Ontology Project (GO) (23) <http://www.geneontology.org/index.shtml> developed EC numbers and GO terms, respectively. They are meant to standardize the description of the function and characteristics of enzymes across all species. EC numbers were assigned based on the type of activity and the substrate the enzyme acts on. GO terms were assigned based on the molecular function of the enzyme, the biological process that it acts in, and the cellular compartment where the enzyme is located.

Standardizing identifiers for genes and enzymes

Three-letter code for enzyme activity

In most of the articles, authors named genes using two- to three-letter codes representing the activity of the encoded protein followed by an assigned number or letter to distinguish each one from others of the same function or from the same species. Sometimes, several different letter codes have been used for the same enzyme activity. For example, ‘xyn’ (24), ‘xyl’ (25) and ‘xln’ (26) have all been used to describe xylanases. In other cases, the same letter code was used for enzymes with different activities. For example, ‘cel’ has been used for endoglucanase (27), xyloglucanase (28), β-glucosidase (29) and cellobiohydrolase (30) activities. To avoid confusion, we have adopted a single three-letter code for each enzyme activity; for example, xyn for endo-1,4-β-xylanase (xylanase). The most commonly used codes for GH family enzymes in the literature were adopted as the unique codes (Table 2). In the case of bifunctional enzymes, where two functional domains can clearly be discerned by sequence analysis, the three-letter code would start with ‘z’ followed by two letters representing the functional domains of the protein. An enzyme carrying an α-arabinofuranosidase domain and a xylosidase domain, for example, would be called ‘zax’.
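
The naming rule above can be sketched as a lookup keyed on enzyme activity rather than on the letter code found in the literature, since codes such as ‘cel’ are ambiguous. This is an illustration only: the lookup table holds a handful of codes from Table 2 written in lower case, and it assumes that the letter representing each domain of a bifunctional enzyme is the first letter of that domain's standard code, which is consistent with the ‘zax’ example but is not stated explicitly in the text.

```python
# Subset of standardized codes, keyed by enzyme activity (lower-case form assumed).
STANDARD_CODES = {
    "endo-1,4-β-xylanase": "xyn",
    "β-xylosidase": "xyl",
    "α-arabinofuranosidase": "abf",
    "endoglucanase": "egl",
    "β-glucosidase": "bgl",
    "cellobiohydrolase": "cbh",
}


def standard_code(activity: str) -> str:
    """Return the single three-letter code adopted for an enzyme activity."""
    try:
        return STANDARD_CODES[activity]
    except KeyError:
        raise KeyError(f"no standardized code recorded for activity: {activity!r}")


def bifunctional_code(first_activity: str, second_activity: str) -> str:
    """Build a code for a bifunctional enzyme: 'z' plus one letter per domain.

    Assumption: each domain is represented by the first letter of its code.
    """
    return "z" + standard_code(first_activity)[0] + standard_code(second_activity)[0]


print(standard_code("endo-1,4-β-xylanase"))
print(bifunctional_code("α-arabinofuranosidase", "β-xylosidase"))
```

Keying on activity means that a gene published as ‘xyl’ or ‘xln’ but encoding a xylanase still receives the code ‘xyn’.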

Table 2.

Activities of the characterized GH

Enzyme name | Code | Activity
α-1,2-mannosidase (2-α-mannosyl-oligosaccharide α-d-mannohydrolase) | MSD | Catalyzes the hydrolysis of terminal, non-reducing-end mannose in mannosyl-oligosaccharides
α-amylase (4-α-d-glucan glucanohydrolase) | AMY | Catalyzes the hydrolysis of internal α-1,4-glucosidic linkages in polysaccharides and releases products in α-configuration
α-arabinofuranosidase (α-l-arabinofuranoside arabinofuranohydrolase) | ABF | Catalyzes terminal, non-reducing-end hydrolysis of α-l-arabinofuranoside residues
α-galactosidase (α-d-galactoside galactohydrolase) | MEL | Catalyzes the hydrolysis of non-reducing-end α-d-galactose residues
α-glucosidase (4-α-d-glucohydrolase) | AGL | Releases glucose by catalyzing the hydrolysis of non-reducing-end α-d-glycosidic links
α-glucuronidase (α-d-glucosiduronate glucuronohydrolase) | AGU | Catalyzes the hydrolysis of glucuronic acid branches from hemicellulose
α-l-rhamnosidase (α-l-rhamnoside rhamnohydrolase) | RHA | Catalyzes the hydrolysis of non-reducing-end α-l-rhamnoside residues
α-xylosidase | AGD | Catalyzes the hydrolysis of terminal α-linked xylosides
Arabinogalactanase (arabinogalactan 4-β-d-galactanohydrolase) | GAN | Catalyzes the hydrolysis of internal β-1,4-linked galactosidic linkages
Arabinoxylan–arabinofuranosidase | AXH | Catalyzes the removal of arabinosides from xylan main chains
β-galactosidase (β-d-galactoside galactohydrolase) | LAC | Catalyzes the hydrolysis of terminal, non-reducing-end β-d-galactose residues
β-glucosidase (β-d-glucoside glucohydrolase) | BGL | Releases glucose by acting on terminal, non-reducing-end β-d-glucosidic links
β-mannanase (4-β-d-mannan mannanohydrolase) | MAN | Catalyzes the hydrolysis of β-1,4-mannosidic linkages in mannans, galactomannans and glucomannans
β-mannosidase (β-d-mannoside mannohydrolase) | MND | Catalyzes the hydrolysis of terminal, non-reducing-end β-d-mannose from β-d-mannosides
β-xylosidase (4-β-d-xylan-xylohydrolase) | XYL | Catalyzes the hydrolysis of the bond joining xylose sugars together in xylobiose
Cellobiohydrolase (4-β-d-glucan cellobiohydrolase) | CBH | Acts on non-reducing-end 1,4-β-d-glucosidic linkages to release cellobiose
Cellulase-enhancing protein | CEP | Exact function unknown but enhances hydrolysis of cellulose by cellulases
Chitinase ((1-4)-2-acetamido-2-deoxy-β-d-glucan glucanohydrolase) | CHI | Catalyzes the random hydrolysis of N-acetyl-β-d-1,4-glucoaminide
Chitosanase (chitosan N-acetylglucosaminohydrolase) | CSN | Catalyzes the hydrolysis of β-1,4 linkages in acetylated chitosans
Dextranase (6-α-d-glucan 6-glucanohydrolase) | DEX | Acts on 1,6-α-glucosidic linkages in dextrins
Endo-arabinanase (5-α-l-arabinan 5-α-l-arabinanohydrolase) | ABN | Catalyzes the hydrolysis of internal α-1,5-arabinofuranosidic linkages in arabinans
Endo-β-1,6-glucanase (6-β-d-glucan glucanohydrolase) | BGN | Catalyzes the random hydrolysis of β-1,6 linkages in β-1,6-linked glucans
Endo-β-N-acetylglucosaminidase | END | Catalyzes the removal of acetylated glycoprotein branches forming mannosyl-oligosaccharides
Endo-inulinase (1-β-d-fructan fructanohydrolase) | INU | Catalyzes the hydrolysis of internal fructosidic linkages in inulin
Endo-polygalacturonase (1,4-α-d-galacturonan glycanohydrolase) | PGA | Catalyzes the random hydrolysis of 1,4-α-galactosiduronic linkages in pectate and galacturonans
Endo-rhamnogalacturonase | RHG | Catalyzes the hydrolysis of links between galacturonic acid and rhamnopyranosyl residues in pectins
Endoglucanase (4-β-d-glucan 4-glucanohydrolase) | EGL | Catalyzes the hydrolysis of β-1,4-glucosidic linkages in cellulose
Exo-1,3-β-glucanase (3-β-d-glucan glucohydrolase) | EXG | Catalyzes the hydrolysis of glucose from the non-reducing ends of β-1,3-glucans
Exo-arabinanase | ARB | Catalyzes the hydrolysis of α-1,5-arabinofuranosidic linkages from the ends of arabinans
Exo-glucosaminidase (chitosan exo-1,4-β-d-glucoaminidase) | GLS | Catalyzes the hydrolysis of glucosamine residues from the non-reducing ends of chitosans
Exo-inulinase (β-d-fructan fructohydrolase) | INX | Catalyzes the hydrolysis of terminal, non-reducing 2,1- and 2,6-linked fructofuranose in fructans
Exo-polygalacturonase (poly{1,4-α-d-galacturonide} galacturonohydrolase) | PGX | Catalyzes the hydrolysis of d-galacturonate from the ends of galacturonides
Exo-rhamnogalacturonase | RGX | Catalyzes the hydrolysis of rhamnoside residues from the ends of pectin
Galactanase (galactan endo-1,6-β-galactosidase) | GAL | Catalyzes the hydrolysis of internal β-1,6-galactosidic linkages in arabinogalactans and the hydrolysis of β-1,3- and β-1,6-galactosidic linkages in mixed galactans
Hexosaminidase (β-N-acetyl-d-hexosaminide N-acetylhexosaminohydrolase) | HEX | Catalyzes the hydrolysis of terminal, non-reducing-end N-acetyl-d-hexosamine residues
Invertase (β-d-fructofuranoside fructohydrolase) | SUC | Catalyzes the hydrolysis of β-d-fructofuranoside from the non-reducing ends of fructofuranosides
Isopullulanase (pullulan 4-glucanohydrolase) | IPU | Catalyzes the hydrolysis of pullulan to isopanose
Laminarinase (3-β-d-glucan glucanohydrolase) | LAM | Catalyzes the hydrolysis of β-1,3-glucosidic linkages in β-1,3-glucans
Licheninase (1,3-, 1,4-β-d-glucan 4-glucanohydrolase) | LIC | Catalyzes the hydrolysis of β-1,4-glucosidic linkages in mixed-link glucans
Mixed-link glucanase (3(or 4)-β-d-glucan 3(4)-glucanohydrolase) | MLG | Catalyzes the hydrolysis of β-1,3 or β-1,4 linkages in mixed glucans when the glucose involved in the linkage is substituted at the 1,3 position
Mutanase (3-α-d-glucan 3-glucanohydrolase) | MUT | Catalyzes the internal hydrolysis of α-1,3-glycosidic linkages
Oligo-1,6-glucosidase (oligosaccharide 6-α-glucohydrolase) | OGL | Catalyzes the hydrolysis of 1,6-glycosidic linkages in oligosaccharides
Oligoxyloglucan cellobiohydrolase (oligoxyloglucan reducing-end cellobiohydrolase) | XBH | Catalyzes the hydrolysis of cellobiose from the reducing ends of xyloglucans with O-6 xylosyl substitutions on the second residue
Trehalase (α,α-trehalose glucohydrolase) | TRE | Catalyzes the hydrolysis of trehalose to release two d-glucose residues
Xylanase (4-β-d-xylan xylanohydrolase) | XYN | Acts on 1,4-β-xylosidic linkages in xylan
Xylogalacturonase | XGH | Catalyzes the hydrolysis of xylosyl substitutions on pectins
Xyloglucanase ([(1-6)-α-d-xylo]-(1-4)-β-d-glucan glucanohydrolase) | XEG | Catalyzes the hydrolysis of bonds involved in xyloglucan chains

This table lists the different enzyme activities collected in the literature survey. BRENDA <http://www.brenda-enzymes.org/> and the Gene Ontology (GO) <http://www.geneontology.org/> were used together to define each activity type and its alternate names. Each entry gives the common, simpler enzyme name, followed by the systematic name used by BRENDA. A three-letter code represents the activity of the enzyme in the gene name and entry name. For each activity, the code most commonly used in the literature was selected. These codes were used in the standardized naming process.

Gene name

The following format was used to standardize the assignment of gene names. The three-letter code of the enzyme activity is followed by a number, which represents the GH family to which the enzyme belongs. Finally, a letter is added to distinguish different genes of the same species that encode the same enzyme function from the same family. If the gene name given in the literature included a letter, that letter was kept in the standardized name. If the given gene name included a number, it was converted to the corresponding letter. For example, xyn2 from GH family 11 would become xyn11B, while bgl5 from GH family 3 would become bgl3E. When the same gene name had been given to multiple genes from the same species and family, their sequences were aligned to check whether they were identical. If the genes were found to encode different enzymes, the letter component of the gene name was assigned according to the publication date of the literature. Thus, the letter ‘A’ (or the first available letter if ‘A’ was taken) represents the enzyme with the earliest published characterization data.
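The gene-name format can be sketched in Python as follows. This is a minimal illustration rather than the curation procedure itself: the function name `standardize_gene_name` and its parameters are hypothetical, and the sketch assumes that a literature name consists of lowercase letters followed by either a number or a single capital letter. Resolving name clashes by sequence alignment and publication date remains a manual step and is not modelled.

```python
import re
import string


def standardize_gene_name(activity_code: str, gh_family: int, literature_name: str) -> str:
    """Combine activity code, GH family number and a distinguishing letter.

    A trailing capital letter in the literature name is kept; a trailing
    number is converted to the corresponding letter (1 -> A, 2 -> B, ...).
    """
    match = re.fullmatch(r"[a-z]+(\d+|[A-Z])?", literature_name)
    if match is None or match.group(1) is None:
        # Names without a usable suffix need a curator's decision.
        raise ValueError(f"cannot derive a letter from {literature_name!r}")
    suffix = match.group(1)
    if suffix.isdigit():
        index = int(suffix)
        if not 1 <= index <= 26:
            raise ValueError(f"number {index} has no corresponding letter")
        suffix = string.ascii_uppercase[index - 1]
    return f"{activity_code}{gh_family}{suffix}"


if __name__ == "__main__":
    print(standardize_gene_name("xyn", 11, "xyn2"))
    print(standardize_gene_name("bgl", 3, "bgl5"))
```

With these assumptions, the two examples from the text give xyn11B and bgl3E. The activity code is passed in separately because the literature prefix (for example ‘xln’) is not always the adopted code.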

Entry identifier

To make each gene entry unique, a naming method similar to that of UniProt was used. A five-letter code representing the natural host of the enzyme was appended to the gene name, preceded by an underscore. The first three letters of the code represent the genus of the fungus and the last two letters represent the species (Table 3). For example, XYN11A_TRIRE represents the GH11 xylanase gene, ‘xynA’, from Trichoderma reesei. If two species would receive the same five-letter code, as Penicillium janthinellum and Penicillium janczewskii do (PENJA), another unique letter from the species name was used. In this case, the codes are PENJA and PENJZ, respectively.
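A minimal Python sketch of the entry identifier follows. The function names (`default_species_code`, `colliding_codes`, `entry_identifier`) and the `overrides` parameter are hypothetical. The default rule takes three letters from the genus and two from the species, but it does not reproduce every code in Table 3 (for example ASPNG or YEAST), so curated codes such as PENJZ are supplied here as explicit overrides rather than derived automatically. The gene name `xyn10A` in the usage lines is purely illustrative.

```python
from collections import defaultdict


def default_species_code(species: str) -> str:
    """First three letters of the genus plus first two of the species epithet."""
    genus, epithet = species.split()[:2]
    return (genus[:3] + epithet[:2]).upper()


def colliding_codes(species_names):
    """Group species that share a default code, so a curator can resolve them."""
    groups = defaultdict(list)
    for name in species_names:
        groups[default_species_code(name)].append(name)
    return {code: names for code, names in groups.items() if len(names) > 1}


def entry_identifier(gene_name: str, species: str, overrides=None) -> str:
    """Join the upper-cased gene name and the species code with an underscore."""
    code = (overrides or {}).get(species, default_species_code(species))
    return f"{gene_name.upper()}_{code}"


if __name__ == "__main__":
    curated = {"Penicillium janczewskii": "PENJZ"}  # manually chosen code
    print(entry_identifier("xyn11A", "Trichoderma reesei"))
    print(entry_identifier("xyn10A", "Penicillium janczewskii", curated))
    print(colliding_codes(["Penicillium janthinellum", "Penicillium janczewskii"]))
```

Keeping the collision check separate from the identifier builder mirrors the text: clashes are detected by rule, while the replacement letter is a curation decision.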

Table 3.

Fungal species having characterized glycoside hydrolases

Species | Code | Number of enzymes characterized | Alternate names
Ascomycota species
    Acremonium blochii | ACRBL | 1 |
    Acrophialophora nainiana | ACRNA | 1 |
    Aphanocladium album | APHAL | 1 |
    Arxula adeninivorans | ARXAD | 2 |
    Aspergillus aculeatus | ASPAC | 9 |
    Aspergillus awamori | ASPAW | 11 |
    Aspergillus flavus | ASPFL | 2 |
    Aspergillus fumigatus | ASPFU | 5 | Sartorya fumigata
    Aspergillus kawachii | ASPKA | 11 | Aspergillus awamori var. kawachii
    Aspergillus niger | ASPNG | 47 |
    Aspergillus oryzae | ASPOR | 16 |
    Aspergillus phoenicis | ASPPH | 1 | Aspergillus saitoi
    Aspergillus shirousami | ASPSH | 2 |
    Aspergillus sojae | ASPSO | 1 |
    Aspergillus species | ASPSP | 1 |
    Aspergillus sulphureus | ASPSU | 2 |
    Aspergillus terreus | ASPTE | 2 |
    Aspergillus tubingensis | ASPTU | 6 |
    Aureobasidium pullulans | AURPU | 4 |
    Bionectria ochroleuca | BIOOC | 4 | Gliocladium roseum
    Bispora sp. MEY-1 | BISSP | 1 |
    Botryotinia fuckeliana | BOTFU | 7 | Botrytis cinerea, Noble-rot fungus
    Candida albicans | CANAL | 6 |
    Candida oleophila | CANOL | 1 |
    Candida tsukubaensis | CANTS | 1 |
    Candida wickerhamii | CANWI | 1 |
    Chaetomium brasiliense | CHABR | 1 |
    Chaetomium gracile | CHAGR | 2 |
    Chaetomium thermophilum | CHATH | 1 |
    Claviceps purpurea | CLAPU | 2 |
    Coccidioides immitis | COCIM | 2 | Valley Fever Fungus
    Cochliobolus carbonum | COCCA | 8 | Bipolaris zeicola
    Cochliobolus sativus | COCSA | 1 | Bipolaris sorokiniana
    Cryphonectria parasitica | CRYPA | 1 | Endothia parasitica, Chestnut Blight Fungus
    Daldinia eschscholzii | DALES | 1 |
    Debaryomyces occidentalis | DEBOC | 3 |
    Emericella desertorum | EMEDE | 1 |
    Emericella nidulans | EMENI | 34 | Aspergillus nidulans
    Fusarium equiseti | FUSEQ | 1 | Fusarium scirpi
    Fusarium oxysporum | FUSOX | 2 | Panama Disease Fungus
    Fusarium solani | FUSSO | 3 | Nectria ipomoeae
    Geotrichum species | GEOSP | 2 | Fermentotrichon, Oosporoidea, Polymorphomyces
    Gibberella species 75 | GIBSP | 1 |
    Gibberella zeae | GIBZE | 3 | Fusarium graminearum, Wheat Head Blight Fungus
    Hansenula anomala | HANAN | 1 | Candida pelliculosa
    Hormoconis resinae | HORRE | 1 | Creosote fungus, Amorphotheca resinae
    Humicola grisea var. thermoidea | HUMGT | 4 |
    Humicola insolens | HUMIN | 6 |
    Hypocrea schweinitzii | HYPSC | 1 |
    Isaria javanicus | ISAJA | 1 | Paecilomyces javanicus
    Kluyveromyces lactis | KLULA | 3 | Candida sphaerica
    Kluyveromyces marxianus | KLUMA | 2 | Candida kefyr
    Kuraishia molischiana | KURMO | 1 | Pichia capsulata
    Lipomyces kononenkoae | LIPKO | 2 |
    Lipomyces starkeyi | LIPST | 1 | Oleaginous yeast
    Magnaporthe grisea | MAGGR | 5 | Pyricularia grisea, Rice Blast Fungus
    Melanocarpus albomyces | MELAO | 3 |
    Metarhizium anisopliae | METAN | 1 |
    Neotyphodium species | NEOSP | 1 |
    Neurospora crassa | NEUCR | 1 |
    Paecilomyces thermophila | PAETH | 1 |
    Penicillium brasilianum | PENBR | 1 |
    Penicillium canescens | PENCA | 1 |
    Penicillium chrysogenum | PENCH | 3 | Penicillium notatum
    Penicillium citrinum | PENCI | 3 |
    Penicillium echinulatum | PENEN | 1 |
    Penicillium funiculosum | PENFN | 6 |
    Penicillium janthinellum | PENJA | 2 | Penicillium vitale
    Penicillium minioluteum | PENMI | 2 |
    Penicillium olsonii | PENOL | 2 |
    Penicillium purpurogenum | PENPU | 6 |
    Penicillium simplicissimum | PENSI | 1 |
    Penicillium species | PENSQ | 8 |
    Periconia species | PERSP | 1 |
    Pichia angusta | PICAN | 2 | Hansenula polymorpha
    Pichia jadinii | PICJA | 1 | Candida utilis
    Pichia pastoris | PICPA | 1 |
    Robillarda species | ROBSP | 1 |
    Saccharomyces cerevisiae | YEAST | 9 | Baker’s Yeast
    Saccharomycopsis fibuligera | SACFI | 3 |
    Schizosaccharomyces pombe | SCHPO | 5 |
    Stachybotrys echinata | STAEC | 1 |
    Talaromyces emersonii | TALEM | 2 |
    Thermoascus aurantiacus | THEAU | 5 |
    Thermomyces lanuginosus | THELA | 2 | Humicola lanuginosa
    Thielavia heterothallica | THIHE | 1 |
    Thielavia terrestris | THITE | 1 | Acremonium alabamense
    Trichoderma asperellum | TRIAS | 3 |
    Trichoderma harzianum | TRIHA | 11 | Hypocrea lixii
    Trichoderma koningii | TRIKO | 2 | Hypocrea koningii
    Trichoderma longibrachiatum | TRILO | 1 |
    Trichoderma reesei | TRIRE | 23 | Hypocrea jecorina
    Trichoderma species | TRISP | 1 |
    Trichoderma viride | TRIVI | 2 |
    Verticillium dahliae | VERDA | 1 | Verticillium Wilt Fungus
    Yarrowia lipolytica | YARLI | 1 | Candida lipolytica
Basidiomycota species
    Agaricus bisporus | AGABI | 4 | Common Mushroom
    Athelia rolfsii | ATHRO | 2 | Sclerotinia rolfsii, Corticium rolfsii
    Chondrostereum purpureum | CHOPU | 1 | Stereum purpureum
    Coprinopsis cinerea | COPCI | 3 | Hormographiella aspergillata, Inky Cap Fungus
    Cryptococcus albidus | CRYAL | 1 | Filobasidium floriforme
    Cryptococcus flavus | CRYFL | 1 |
    Cryptococcus species | CRYSP | 2 |
    Fomitopsis palustris | FOMPA | 2 |
    Fomitopsis pinicola | FOMPI | 1 |
    Irpex lacteus | IRPLA | 6 | Polyporus tulipiferae, Milk-white Toothed Polypore
    Meripilus giganteus | MERGI | 1 |
    Phaffia rhodozyma | PHARA | 2 | Xanthophyllomyces dendrorhous
    Phanerochaete chrysosporium | PHACH | 11 | Sporotrichum pruinosum
    Schizophyllum commune | SCHCO | 1 | Bracket Fungus
    Sporobolomyces singularis | SPOSI | 1 |
    Trametes hirsuta | TRAHI | 1 |
    Uromyces fabae | UROFA | 1 | Rust Fungus
Mucoromycotina species
    Gongronella species | GONSP | 1 |
    Mortierella alliacea | MORAL | 1 |
    Mucor circinelloides | MUCCI | 2 | Mucor griseo-roseus
    Mucor hiemalis | MUCHI | 1 |
    Mucor javanicus | MUCJA | 1 |
    Mycocladus corymbiferus | MYCCO | 1 | Absidia corymbiferus
    Phycomyces nitens | PHYNI | 1 |
    Rhizopus oligosporus | RHIOL | 1 |
    Rhizopus oryzae | RHIOR | 17 | Rhizopus delemar
    Rhizopus species | RHISP | 1 |
    Syncephalastrum racemosum | SYNRA | 1 |
Neocallimastigomycota species
    Neocallimastix frontalis | NEOFR | 1 |
    Neocallimastix patriciarum | NEOPA | 4 |
    Orpinomyces joyonii | ORPJO | 1 |
    Orpinomyces species | ORPSP | 3 |
    Piromyces equi | PIREQ | 2 |
    Piromyces species | PIRSP | 7 |

This table lists the fungal species and the number of characterized glycoside hydrolases collected from each, grouped by phylum. The species name used in this study is given on the left, and any other names used for the same species are listed under ‘Alternate names’. The five-letter codes used for the standardized naming of genes are also listed. The codes follow the naming system used by UniProt: the first three letters represent the genus and the last two letters represent the species of the fungus.

Results

Using the procedures described in the ‘Methods’ section, we have collected a total of 453 characterized GH enzymes of fungal origin. They come from 131 different fungal species (Table 3), most of which belong to the phylum Ascomycota. The genus Aspergillus accounts for the largest number of characterized GH proteins, led by Aspergillus niger with 47 enzymes. The collected enzymes represent 49 different GH activities and cover 44 of the GH families described in CAZy (1–5) (Table 4). All of these enzymes were extracellular, with 443 enzymes as soluble extracellular proteins and only 6 that were shown to attach to the external cell wall.

Table 4.

GH families having characterized enzymes of fungal origin

Family | Number of enzymes characterized | Activity
GH1 | 7 | β-glucosidase (7)
GH2 | 5 | β-mannosidase (2), chitosanase (1), exo-glucosaminidase (1), β-galactosidase (1)
GH3 | 30 | β-glucosidase (22), β-xylosidase (8)
GH5 | 45 | Endoglucanase (22), exo-1,3-β-glucanase (12), β-mannanase (8), galactanase (2), endo-1,6-β-glucanase (1)
GH6 | 12 | Cellobiohydrolase (11), endoglucanase (1)
GH7 | 29 | Cellobiohydrolase (18), endoglucanase (10), xylanase (1)
GH9 | 1 | Endoglucanase (1)
GH10 | 19 | Xylanase (19)
GH11 | 44 | Xylanase (44)
GH12 | 24 | Endoglucanase (20), xyloglucanase (3), licheninase (1)
GH13 | 10 | α-glucosidase (3), α-amylase (6), oligo-1,6-glucosidase (1)
GH15 | 14 | Glucoamylase (14)
GH16 | 5 | Mixed-link glucanase (3), laminarinase (1), licheninase (1)
GH17 | 2 | Laminarinase (1), exo-1,3-β-glucanase (1)
GH18 | 13 | Chitinase (13)
GH20 | 6 | Hexosaminidase (6)
GH26 | 3 | β-mannanase (3)
GH27 | 6 | α-galactosidase (6)
GH28 | 54 | Endo-polygalacturonase (40), exo-polygalacturonase (9), endo-rhamnogalacturonase (3), exo-rhamnogalacturonase (1), xylogalacturonase (1)
GH30 | 3 | Endo-1,6-β-glucanase (3)
GH31 | 10 | α-glucosidase (8), α-xylosidase (1), invertase (1)
GH32 | 22 | Invertase (10), exo-inulinase (7), endo-inulinase (4)
GH35 | 1 | β-galactosidase (1)
GH36 | 7 | α-galactosidase (7)
GH37 | 4 | Trehalase (4)
GH43 | 6 | Endo-1,5-α-arabinanase (3), α-L-arabinofuranosidase (2), β-xylosidase (1)
GH45 | 8 | Endoglucanase (8)
GH47 | 5 | α-1,2-mannosidase (5)
GH49 | 4 | Dextranase (3), isopullulanase (1)
GH51 | 5 | α-L-arabinofuranosidase (5)
GH53 | 6 | Arabinogalactanase (6)
GH54 | 9 | α-L-arabinofuranosidase (9)
GH55 | 4 | Exo-1,3-β-glucanase (3), laminarinase (1)
GH61 | 3 | Cellulase-enhancing protein (3)
GH62 | 2 | Arabinoxylan arabinofuranosidase (2)
GH65 | 2 | Trehalase (2)
GH67 | 4 | α-glucuronidase (4)
GH71 | 3 | Mutanase (3)
GH74 | 6 | Xyloglucanase (3), oligoxyloglucan cellobiohydrolase (2), endoglucanase (1)
GH75 | 2 | Chitosanase (2)
GH78 | 3 | α-rhamnosidase (3)
GH81 | 2 | Laminarinase (2)
GH85 | 1 | N-acetylglucosaminidase (1)
GH93 | 2 | Exo-arabinanase (2)
Total | 453 |

The GH families and the total number of biochemically characterized enzymes collected for each are listed here. GH families without any characterized fungal enzymes are not included. The column titled ‘Activity’ shows the different types of activity found among the collected enzymes of each GH family. The numbers in parentheses show the distribution of activity types within each family.

Table 5.

Some properties of characterized fungal cellulases

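Activity | GH family | Total | Optimal pH | Optimal temperature (°C) | Temperature stability (°C) | Mass (kDa)
β-glucosidase | GH1 | 7 | 3.5–6.3 | 40–55 | 40–55 | 52–94
β-glucosidase | GH3 | 22 | 3.5–8.0 | 37–72 | 30–70 | 74–145
Cellobiohydrolase | GH6 | 11 | 4.8–9.0 | 40–50 | 30–60 | 40–60
Cellobiohydrolase | GH7 | 18 | 5.0–6.0 | 35–65 | 50–55 | 47–90
Endoglucanase | GH5 | 22 | 3.5–8.5 | 40–75 | 37–80 | 35–56
Endoglucanase | GH6 | 1 | 5.5 | NA | NA | 38
Endoglucanase | GH7 | 10 | 4.0–5.5 | 452–57 | 40–50 | 46–56
Endoglucanase | GH9 | 1 | NA | NA | NA | 90
Endoglucanase | GH12 | 20 | 2.0–5.0 | 55–70 | 40–55 | 25–32
Endoglucanase | GH45 | 8 | 5.0–7.0 | 30–65 | 60 | 20–47
Endoglucanase | GH74 | 1 | 4.5 | 55 | 30 | 90
Cellulase-enhancing proteins | GH61 | 3 | 4.0–6.0 | NA | NA | 36–56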

The three types of cellulase activity (β-glucosidase, endoglucanase and cellobiohydrolase) are listed here. Cellulase-enhancing proteins have been included as well. A range of the biochemical properties for each activity type is presented according to GH family. ‘NA’ indicates that the information was not available. The summarized data presented in the table were compiled from this study.

Distribution of enzyme activities

Cellulases comprise ∼27% of the characterized GH in this database. Collectively, they cover nine different GH families. The 63 endoglucanases are the cellulase activity type with the most published characterization data. GH5 contributes the largest share (22 enzymes), followed by GH12 (20). The cellobiohydrolases come from GH6 and GH7 with 11 and 18 proteins, respectively. The β-glucosidases also cover two families, with 7 from GH1 and 22 from GH3.
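The arithmetic behind these figures can be retraced from the counts in Table 4. The following is a minimal sketch, not part of mycoCLAP itself: the dictionary layout and the function name `summarize` are illustrative assumptions, and the GH61 cellulase-enhancing proteins are deliberately left out, which is the reading that reproduces the nine families quoted above.

```python
# Counts copied from Table 4: activity -> {GH family: number of enzymes}.
# GH61 cellulase-enhancing proteins are excluded here (an assumption).
CELLULASES = {
    "endoglucanase": {"GH5": 22, "GH6": 1, "GH7": 10, "GH9": 1,
                      "GH12": 20, "GH45": 8, "GH74": 1},
    "cellobiohydrolase": {"GH6": 11, "GH7": 18},
    "beta-glucosidase": {"GH1": 7, "GH3": 22},
}
TOTAL_ENZYMES = 453  # total number of characterized GH in the collection


def summarize(cellulases, total):
    """Return per-activity totals, family count, cellulase count and share (%)."""
    per_activity = {name: sum(fams.values()) for name, fams in cellulases.items()}
    families = {fam for fams in cellulases.values() for fam in fams}
    n_cellulases = sum(per_activity.values())
    share = 100.0 * n_cellulases / total
    return per_activity, len(families), n_cellulases, share


per_activity, n_families, n_cellulases, share = summarize(CELLULASES, TOTAL_ENZYMES)
print(per_activity)
print(n_families, n_cellulases, round(share, 1))
```

Under these assumptions the tally gives 63 endoglucanases, 29 cellobiohydrolases and 29 β-glucosidases, i.e. 121 of 453 enzymes across nine families.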

With 64 entries, xylanases are the most frequently characterized activity type. Most of them come from GH families 10 and 11. A single xylanase, XYN7A_PENFN, contains a GH7 domain (31,32), which is unusual: no other literature in the collection described a xylanase from GH7. Other xylan-active enzymes collected include β-xylosidases, arabinoxylan arabinofuranosidases and α-glucuronidases.

Several different types of arabinan-active enzymes are represented in the collection. There are three endo-arabinanases and two exo-arabinanases, belonging to GH43 and GH93, respectively. We also collected 17 enzymes having α-arabinofuranosidase activity. These enzymes act on arabinans as well as arabinosyl side chains attached to other polysaccharides. They are divided among GH families 43, 51 and 54.

The mannan-active enzymes collected in the greatest number are the β-mannanases. There are 11 such enzymes, 8 from GH5 and 3 from GH26. Some showed high thermostability, with optimal temperatures as high as 79°C. β-Mannosidases also act on the main chain of mannans. Two β-mannosidases, both from GH2, were collected.

Mannans may have other sugars such as glucose incorporated into the backbone or galactose present in side chains. Enzymes with different specificities are required to hydrolyze these residues. One example of these types of enzymes is α-galactosidase, which is represented in the database by six enzymes from GH27 and seven from GH36.

Pectin, another common lignocellulose polymer, can occur in a variety of forms. It is mostly composed of galacturonic acids, which can alternate with rhamnogalacturonans in the main chain, or have branches composed of a variety of different residues. All of the glycoside hydrolases active on the main chains of pectins are from GH28. There are 54 in total, with 40 having endo-polygalacturonase activity, 9 having exo-polygalacturonase activity, 3 having endo-rhamnogalacturonase activity and 1 each of exo-rhamnogalacturonase and xylogalacturonase.

Chitin, inulin and starch are also components of biomass. The two major enzymes active on chitin are chitinases and chitosanases. Characterized chitinases are the more numerous of the two, with 13 in the database, all of which are from GH18. Three chitosanases were collected: one from GH2 and two from GH75. The characterized inulin-active enzymes mostly come from GH32, except for one invertase with a GH31 domain. This invertase, along with the GH32 invertases and endo- and exo-inulinases, adds up to a total of 21 inulin-active enzymes. There is a greater variety of glycoside hydrolases active on starch compared to chitin and inulin. Some of these include α-amylase, glucoamylase, oligo-1,6-glucosidase, dextranase and trehalase.

A variety of enzymes active on non-cellulosic β-glucans were collected as well. The most abundant are the 16 exo-1,3-β-glucanases from GH families 5, 17 and 55.

We have cataloged a limited number of characterized enzymes for other GH activity types including α-rhamnosidase, oligoxyloglucan cellobiohydrolase, mutanase, β-galactosidase, α-1,2-mannosidase, endo-N-acetylglucosaminidase and galactanase to name a few. For more details and properties of the individual enzymes, and the data and literature on these 453 characterized GH, see mycoCLAP <http://mycoCLAP.fungalgenomics.ca/>.

The mycoCLAP database

The mycoCLAP collection of fungal enzymes is searchable by BLAST alignment using a query sequence or by keywords. A BLAST search displays the entries most similar to the query, while a keyword search displays results in a tabular format (Figure 1). The results table lists enzymes by their unique entry name followed by the corresponding data. Results can be filtered by selecting or deselecting specific entries on the search page. The database can also be browsed by leaving the keyword fields blank and clicking on the ‘Search’ button. Selecting an entry name leads to the gene page containing all the data, sequences and literature related to the enzyme (Figure 2).

Figure 1.

mycoCLAP search page. The main search page from mycoCLAP is shown here. Keywords such as enzyme activity, glycoside hydrolase family or a substrate name can be entered as search terms. Leaving the field blank and clicking on ‘Search’ allows a user to browse the database. The information recorded during the curation process has been divided into the six categories shown here: Enzyme Name, Biochemical Properties, Annotation, External Resources, Protein Features and Sequence. By checking boxes under these categories, a user can determine which types of information will be displayed on the results page. The default settings are shown here. The tabs along the top allow quick and easy navigation through all of mycoCLAP’s features.

Figure 2.

Gene page example. This screenshot illustrates the layout and types of data available on mycoCLAP. It shows part of the gene page for the glycoside hydrolase XYN10A_ASPNG (a family 10 xylanase from Aspergillus niger). The ‘Names and Origin’ section includes any names or abbreviations used to identify the enzyme in the literature or in other databases. A search by any of those names will return this enzyme as a hit. The next section contains biochemical properties extracted from the literature. This entry records the enzyme’s specific activity, pH optimum and temperature optimum on birch wood xylan when expressed from two different hosts. Other information recorded on gene pages includes nucleotide and amino acid sequences, protein domains, assay conditions, enzyme family, literature citations and other features recorded in the literature that make the entry unique.
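
As an illustration of how such entry names could be handled programmatically, the following minimal Python sketch splits a name like XYN10A_ASPNG into its apparent parts. The pattern is an assumption inferred only from this single example (activity abbreviation, GH family number, one distinguishing letter, an underscore and a species code); it is not a documented mycoCLAP specification, and `parse_entry_name` is a hypothetical helper.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, inferred only from the example XYN10A_ASPNG:
# activity abbreviation (letters), GH family number (digits),
# one distinguishing letter, an underscore, then a species code.
ENTRY_NAME = re.compile(r"^([A-Z]+)(\d+)([A-Z])_([A-Z0-9]+)$")


class EntryName(NamedTuple):
    activity: str  # e.g. "XYN"
    family: int    # e.g. 10
    member: str    # e.g. "A"
    species: str   # e.g. "ASPNG"


def parse_entry_name(name: str) -> Optional[EntryName]:
    """Split an entry name into its parts, or return None if it does not fit."""
    match = ENTRY_NAME.match(name.strip())
    if match is None:
        return None
    activity, family, member, species = match.groups()
    return EntryName(activity, int(family), member, species)


print(parse_entry_name("XYN10A_ASPNG"))
```

Names that do not follow the assumed pattern yield `None` rather than a guess, so unexpected formats are easy to spot.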

The download option allows a user to download text data and/or sequences in FASTA format. The types of data to be downloaded are selected in the same way as for the keyword search. Individual enzymes, or a subset of them, can be selected for download by using the check boxes on the left of the results table. mycoCLAP also provides a list of resources for various annotation tools and ongoing sequencing projects. They are listed under the tab ‘Useful Links’ along with a short description.
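
Once sequences have been downloaded, they can be processed with a few lines of code. The sketch below is a minimal Python example that reads FASTA text and keeps the records for one species. It assumes, without confirmation from the database, that each header line begins with the entry name; the sequences and the entry HYPOTH1A_TRIRE are placeholders invented for illustration, not real mycoCLAP data.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into a {identifier: sequence} dictionary."""
    records: Dict[str, str] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the identifier is the first whitespace-delimited token.
            tokens = line[1:].split()
            current = tokens[0] if tokens else "unnamed"
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def select_species(records: Dict[str, str], species_code: str) -> Dict[str, str]:
    """Keep records whose identifier ends with '_<species_code>'."""
    suffix = "_" + species_code
    return {name: seq for name, seq in records.items() if name.endswith(suffix)}


# Placeholder sequences for illustration only.
EXAMPLE = """\
>XYN10A_ASPNG placeholder header text
MKTAYIAK
QRQISFVK
>HYPOTH1A_TRIRE placeholder header text
MSLLTAVV
"""

records = read_fasta(EXAMPLE.splitlines())
print(select_species(records, "ASPNG"))
```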

Users are encouraged to add new entries and make corrections to existing entries using the ‘New Entry’ and ‘Correction’ forms. A curator will review each submission before any changes or additions are made to the database.

CAZy is a well-maintained and comprehensive resource for carbohydrate-active enzymes, whereas mycoCLAP, at its current stage of development and with its focus on fungal glycoside hydrolases, is far less comprehensive. The major difference between the two databases is that mycoCLAP contains only sequences whose products have been biochemically characterized, whereas CAZy includes sequences with predicted function. mycoCLAP also provides a BLAST resource, an important tool in the annotation of novel sequences that is not available in CAZy. Using BLAST to search only the characterized enzymes allows the most closely related sequence with experimentally documented characteristics to be located rapidly. mycoCLAP also provides relevant structural and biochemical data extracted from published literature in an easy-to-view format. It should be noted that mycoCLAP is focused on natural diversity and does not yet include engineered or evolved variants of naturally occurring fungal enzymes.

Conclusion

Characterized glycoside hydrolases were identified from literature obtained through BRENDA, PubMed, Google Scholar and myNCBI. Their properties were collected in a spreadsheet, and the corresponding gene and protein sequences were collected from GenBank and UniProt. Standardized functional annotations from the Gene Ontology (GO) and the Enzyme Commission (EC) classification were assigned based on findings from the literature. The collected data and assigned annotations were then deposited in mycoCLAP.

The mycoCLAP database is intended to facilitate the annotation of glycoside hydrolases active in the decomposition of plant biomass by providing a mechanism for comparing novel sequences with a set of sequences whose gene products have been characterized. Such comparisons should reduce the occurrence of false positives in homolog searches, shorten the sorting process and speed up the identification of targets to guide experimental analysis.
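
One way such a comparison could be carried out is to run BLAST locally against the characterized sequences and keep, for each novel sequence, only its strongest hit above chosen cut-offs. The Python sketch below parses the standard 12-column BLAST tabular output (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The thresholds, the function name and the example rows are illustrative assumptions, not values recommended by mycoCLAP.

```python
from typing import Dict, Iterable, Tuple


def best_characterized_hits(
    rows: Iterable[str],
    min_identity: float = 40.0,   # assumed cut-off, percent identity
    max_evalue: float = 1e-10,    # assumed cut-off, E-value
) -> Dict[str, Tuple[str, float]]:
    """Return {query: (best subject, bit score)} for hits passing both cut-offs."""
    best: Dict[str, Tuple[str, float]] = {}
    for row in rows:
        if not row.strip() or row.startswith("#"):
            continue  # skip blank and comment lines
        fields = row.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])
        evalue = float(fields[10])
        bitscore = float(fields[11])
        if identity < min_identity or evalue > max_evalue:
            continue
        # Keep the hit with the highest bit score for each query.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Hypothetical rows for illustration only.
ROWS = [
    "novel_1\tXYN10A_ASPNG\t78.5\t310\t66\t1\t5\t314\t20\t329\t1e-150\t520",
    "novel_1\tHYPOTH1A_TRIRE\t45.0\t300\t160\t5\t10\t309\t15\t314\t1e-60\t210",
    "novel_2\tHYPOTH1A_TRIRE\t28.0\t120\t80\t6\t1\t120\t1\t120\t0.002\t35.5",
]

print(best_characterized_hits(ROWS))
```

In this example the second query has no hit passing the cut-offs and is therefore absent from the result, which flags it as lacking a close characterized relative rather than assigning it a weakly supported function.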

The curation of characterized GH data is an ongoing project that will be continually updated and expanded. Currently, characterized carbohydrate esterases, polysaccharide lyases and lipases involved in biomass degradation are being curated for incorporation into mycoCLAP. Future work will include the collection of other characterized lignocellulose-active enzymes such as peroxidases, cellobiose dehydrogenases and proteases, as well as engineered versions of these characterized enzymes. Using the methods outlined, we believe that we have exhausted the literature describing fungal glycoside hydrolases, although there is no way to be sure that all the relevant literature has been collected. With the efficient breakdown of biomass for the production of biofuels and bioproducts becoming increasingly important, new information will constantly become available. With the continuing contribution from our curators and the help of submissions from other researchers in the field, the database will be regularly updated and thus provide the fungal research community with the latest and most comprehensive collection of knowledge and data.

While mycoCLAP offers a detailed, searchable database that can be used to survey and rapidly locate information on characterized, sequenced, lignocellulose-active enzymes, it is important to recognize its limitations as a comparative tool. Some parameters, such as temperature and pH optima, can with some care be compared between enzymes. Others, such as Vmax and Km, depend on parameters such as temperature and pH, and were not always measured under optimal conditions. For these parameters, the user should refer to the original articles for additional details. Finally, it is not often appreciated that reducing sugar assays can give quite different results depending on the method used (33), as can protein assays. Hence, it is very difficult to compare the specific activities of different enzyme preparations unless the same activity and protein assays were used.
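
A user wishing to respect this caveat when working with exported data could group specific-activity values by substrate, activity assay and protein assay before comparing them. The Python sketch below shows one way to do so. The record fields, entry names and values are hypothetical and do not reflect the actual mycoCLAP export format.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class ActivityRecord(NamedTuple):
    entry: str
    substrate: str
    activity_assay: str       # reducing sugar method, as reported
    protein_assay: str        # protein quantification method, as reported
    specific_activity: float  # units per mg, as reported


Key = Tuple[str, str, str]


def comparable_groups(records: List[ActivityRecord]) -> Dict[Key, List[ActivityRecord]]:
    """Group records sharing substrate and both assay methods.

    Only groups with at least two records are returned, since a single
    record has nothing to be compared with.
    """
    groups: Dict[Key, List[ActivityRecord]] = defaultdict(list)
    for rec in records:
        key = (rec.substrate.lower(), rec.activity_assay.lower(), rec.protein_assay.lower())
        groups[key].append(rec)
    return {key: recs for key, recs in groups.items() if len(recs) > 1}


# Hypothetical records for illustration only.
RECORDS = [
    ActivityRecord("ENZ_A", "birch wood xylan", "DNS", "Bradford", 120.0),
    ActivityRecord("ENZ_B", "birch wood xylan", "DNS", "Bradford", 85.0),
    ActivityRecord("ENZ_C", "birch wood xylan", "Somogyi-Nelson", "Lowry", 40.0),
]

for key, recs in comparable_groups(RECORDS).items():
    print(key, [r.entry for r in recs])
```

Here ENZ_C is left out of any comparison because both of its assay methods differ from those used for the other two preparations.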

Acknowledgement

The authors would like to thank Vineet Dua for his contribution to editing the collected data.

Funding

Cellulosic Biofuel Network of the Agricultural Bioproducts Innovation Program of Agriculture and Agri-Food Canada; Genome Canada and Génome Québec. Funding for open access charge: Genome Canada.

Conflict of interest. None declared.

References

1. Henrissat B, Davies G. Structural and sequence-based classification of glycoside hydrolases. Curr. Opin. Struct. Biol. 1997, vol. 7 (pg. 637-644)
2. Davies G, Henrissat B. Structures and mechanisms of glycosyl hydrolases. Structure 1995, vol. 3 (pg. 853-859)
3. Henrissat B, Bairoch A. Updating the sequence-based classification of glycosyl hydrolases. Biochem. J. 1996, vol. 316 Pt 2 (pg. 695-696)
4. Henrissat B, Bairoch A. New families in the classification of glycosyl hydrolases based on amino acid sequence similarities. Biochem. J. 1993, vol. 293 Pt 3 (pg. 781-788)
5. Henrissat B. A classification of glycosyl hydrolases based on amino acid sequence similarities. Biochem. J. 1991, vol. 280 Pt 2 (pg. 309-316)
6. Lundell TK, Makela MR, Hilden K. Lignin-modifying enzymes in filamentous basidiomycetes–ecological, functional and phylogenetic review. J. Basic Microbiol. 2010, vol. 50 (pg. 5-20)
7. Sanchez C. Lignocellulosic residues: biodegradation and bioconversion by fungi. Biotechnol. Adv. 2009, vol. 27 (pg. 185-194)
8. Coleman JJ, Rounsley SD, Rodriguez-Carres M, et al. The genome of Nectria haematococca: contribution of supernumerary chromosomes to gene expansion. PLoS Genet. 2009, vol. 5 pg. e1000618
9. Ellwood SR, Liu Z, Syme RA, et al. A first genome assembly of the barley fungal pathogen Pyrenophora teres f. teres. Genome Biol. 2010, vol. 11 pg. R109
10. Magrini V, Warren WC, Wallis J, et al. Fosmid-based physical mapping of the Histoplasma capsulatum genome. Genome Res. 2004, vol. 14 (pg. 1603-1609)
11. Martin F, Aerts A, Ahren D, et al. The genome of Laccaria bicolor provides insights into mycorrhizal symbiosis. Nature 2008, vol. 452 (pg. 88-92)
12. Martinez D, Berka RM, Henrissat B, et al. Genome sequencing and analysis of the biomass-degrading fungus Trichoderma reesei (syn. Hypocrea jecorina). Nat. Biotechnol. 2008, vol. 26 (pg. 553-560)
13. Martinez D, Challacombe J, Morgenstern I, et al. Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion. Proc. Natl Acad. Sci. USA 2009, vol. 106 (pg. 1954-1959)
14. Martinez D, Larrondo LF, Putnam N, et al. Genome sequence of the lignocellulose degrading fungus Phanerochaete chrysosporium strain RP78. Nat. Biotechnol. 2004, vol. 22 (pg. 695-700)
15. Nierman WC, Pain A, Anderson MJ, et al. Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus. Nature 2005, vol. 438 (pg. 1151-1156)
16. Dean RA, Talbot NJ, Ebbole DJ, et al. The genome sequence of the rice blast fungus Magnaporthe grisea. Nature 2005, vol. 434 (pg. 980-986)
17. Galagan JE, Calvo SE, Borkovich KA, et al. The genome sequence of the filamentous fungus Neurospora crassa. Nature 2003, vol. 422 (pg. 859-868)
18. Espagne E, Lespinet O, Malagnac F, et al. The genome sequence of the model ascomycete fungus Podospora anserina. Genome Biol. 2008, vol. 9 pg. R77
19. Kamper J, Kahmann R, Bolker M, et al. Insights from the genome of the biotrophic fungal plant pathogen Ustilago maydis. Nature 2006, vol. 444 (pg. 97-101)
20. Chang A, Scheer M, Grote A, et al. BRENDA, AMENDA and FRENDA the enzyme information system: new content and tools in 2009. Nucleic Acids Res. 2009, vol. 37 (pg. D588-D592)
21. Benson DA, Karsch-Mizrachi I, Lipman DJ, et al. GenBank. Nucleic Acids Res. 2008, vol. 36 (pg. D25-D30)
22. UniProt Consortium. The Universal Protein Resource (UniProt) 2009. Nucleic Acids Res. 2009, vol. 37 (pg. D169-D174)
23. The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. Nature Genet. 2000, vol. 25 (pg. 25-29)
24. Hessing JG, van Rotterdam C, Verbakel JM, et al. Isolation and characterization of a 1,4-beta-endoxylanase gene of A. awamori. Curr. Genet. 1994, vol. 26 (pg. 228-232)
25. Giesbert S, Lepping HB, Tenberge KB, et al. The xylanolytic system of Claviceps purpurea: cytological evidence for secretion of xylanases in infected rye tissue and molecular characterization of two xylanase genes. Phytopathology 1998, vol. 88 (pg. 1020-1030)
26. de Graaff LH, van den Broeck HC, van Ooijen AJ, et al. Regulation of the xylanase-encoding xlnA gene of Aspergillus tubigensis. Mol. Microbiol. 1994, vol. 12 (pg. 479-490)
27. Steenbakkers PJ, Ubhayasekera W, Goossen HJ, et al. An intron-containing glycoside hydrolase family 9 cellulase gene encodes the dominant 90 kDa component of the cellulosome of the anaerobic fungus Piromyces sp. strain E2. Biochem. J. 2002, vol. 365 Pt 1 (pg. 193-204)
28. Desmet T, Cantaert T, Gualfetti P, et al. An investigation of the substrate specificity of the xyloglucanase Cel74A from Hypocrea jecorina. FEBS J. 2007, vol. 274 (pg. 356-363)
29. Murray P, Aro N, Collins C, et al. Expression in Trichoderma reesei and characterisation of a thermostable family 3 beta-glucosidase from the moderately thermophilic fungus Talaromyces emersonii. Protein Expr. Purif. 2004, vol. 38 (pg. 248-257)
30. Ohnishi Y, Nagase M, Ichiyanagi T, et al. Transcriptional regulation of two cellobiohydrolase encoding genes (cel1 and cel2) from the wood-degrading basidiomycete Polyporus arcularius. Appl. Microbiol. Biotechnol. 2007, vol. 76 (pg. 1069-1078)
31. Alcocer MJ, Furniss CS, Kroon PA, et al. Comparison of modular and non-modular xylanases as carrier proteins for the efficient secretion of heterologous proteins from Penicillium funiculosum. Appl. Microbiol. Biotechnol. 2003, vol. 60 (pg. 726-732)
32. Furniss CSM, Williamson G, Kroon PA. The substrate specificity and susceptibility to wheat inhibitor proteins of Penicillium funiculosum xylanase from a commercial enzyme preparation. J. Sci. Food Agriculture 2005, vol. 85 (pg. 574-582)
33. Kongruang S, Han MJ, Breton CI, et al. Quantitative analysis of cellulose-reducing ends. Appl. Biochem. Biotechnol. 2004, vol. 113–116 (pg. 213-231)

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.