SDADB: a functional annotation database of protein structural domains

Author Notes

Abstract

Annotating functional terms with individual domains is essential for understanding the functions of full-length proteins. We describe SDADB, a functional annotation database for structural domains. SDADB provides associations between gene ontology (GO) terms and SCOP domains calculated with an integrated framework. GO annotations are assigned probabilities of being correct, which are estimated with a Bayesian network by taking advantage of structural neighborhood mappings, SCOP-InterPro domain mapping information, position-specific scoring matrices (PSSMs) and sequence homolog features, with the most substantial contribution coming from high-coverage structure-based domain-protein mappings. The domain-protein mappings are computed using large-scale structure alignment. SDADB contains ontological terms with probabilistic scores for more than 214 000 distinct SCOP domains. It also provides additional features include 3D structure alignment visualization, GO hierarchical tree view, search, browse and download options.

Database URL: http://sda.denglab.org

Introduction

A protein domain is a conserved and functional unit of a protein that can fold independently and has distinct functions. Most proteins consist of one or several domains. A unique domain may appear in a variety of different proteins that capture specific functions. Usually, specific functions of protein domains are highly independent, and they are, in many cases, conserved across species (1). For example, the catalytic domain of serine/threonine/tyrosine protein kinases is highly conserved from E. coli to human containing the catalytic function, and shares conserved catalytic regions with both serine/threonine and tyrosine protein kinases (2). The N-terminal of the catalytic domain has been shown to be involved in ATP binding, while the central part of the catalytic domain plays important roles in the catalytic activity of the enzyme (3, 4). A broad range of approaches has been developed to the problem of automatically identifying domain regions in protein sequences based on some degree of relatedness shared between domain sequences. InterPro (5, 6) is the widely used sequence-based domain database, which collates important resources for protein domain classifications: Pfam (7), CATH-Gene3D (8), SMART (9), ProDom (10), SUPERFAMILY (11) and PROSITE (12). The Conserved Domain Database (CDD) (13) maintains domain annotations for sequences. It produces representative sequence fragments, which are in agreement with domain boundaries as observed in protein 3D structure. A more reliable way to assign structures to the domain families is using the structural information. As the widely used hierarchical classification scheme of proteins, SCOP (14) groups protein domains into Class, fold, superfamily and family according to structural and evolutionary relationships (15). The current version of SCOPe version 2.06 (16, 17) contains over 240 000 structural domains.

Assigning ontological terms to specific domains are important for fully understanding functions of proteins. Gene ontology (GO) has been a de facto standard for describing gene and protein function (18, 19). It arranges in a directed acyclic graph and discriminates between molecular function and biological process, as well as subcellular localization. The GO terms in top levels describe general functions such as catalytic activity and binding. While deeper GO terms in the hierarchy represent more specific functions. For sequence-based domains, a few have been manually annotated with GO terms, and several computational prediction methods have been developed. The InterPro2GO mapping (20) is curated manually by the InterPro team, who compare InterPro and protein entries, check the statistic and conservation information, and assign most appropriate and specific GO terms to the InterPro domain. The Pfam2GO mapping is subsequently created by mapping InterPro domains to Pfam domains. Forslund and Sonnhammer (21) developed a probabilistic model to predict the relationship between multiple Pfam domain and annotation GO terms. Rentzsch and Orengo (22) use domain functional families (FunFams) to predict the functions of whole proteins. They group domain sequences into FunFams based on the GO annotations and associate the FunFams with GO terms probabilistically.

Although sequence-based domain annotation and domain-centric protein function prediction have been extensively studied, predicting functions for protein structural domains is, even more, difficult given the lack of comprehensive structural domain information for proteins. Only a few previous efforts have been performed to computationally predict structural domain functions (11, 23, 24). The SUPERFAMILY database (11) contains SCOP domain architecture and classification assignments to sequences at the superfamily level by using hidden Markov models. Based on the sequence homology to SCOP structural domain mapping in SUPERFAMILY, the dcGO database (24) provides GO annotations for SCOP domains in a probabilistic framework at the superfamily and family levels. Daniel and Florencio (23) proposed a scop2go approach, which annotates SCOP domains with molecular function GO terms based on the fold distribution of PDB structures associated with given GO terms. Although these resources are valuable, they only have coarse-grained level function annotation and are largely incomplete in that many domains are still not annotated.

Recently, we proposed a functional annotation approach for structural domains (SDA) that is largely based on 3D structure-based domain-protein mappings (25). We used a Bayesian network to integrate heterogeneous information: (i) protein-to-domain mappings calculated using all-against-all structural alignment of SCOP domains and protein structures from the PDB database; (ii) SCOP-to-InterPro domain mappings calculated using the InterProScan software; (iii) SVM models generated based on the position-specific scoring matrix (PSSM) profiles; (iv) sequence homologs mapped to SCOP domains using a Bayesian network. We showed the advantages of integrating large-scale structure-based mappings and other heterogeneous information sources for structural domain function prediction.

Here, we present the SDA database, which provides domain GO annotations predicted from our integrated method, and also includes links to other databases. The server allows users to query functional annotations for input proteins or domains. The results can be visualized in an interactive 3D viewer and a tree viewer. SDADB is available at http://sda.denglab.org.

Methods and data sources

Structural domains are downloaded from SCOPe version 2.06 (16, 17). GO annotations for the SCOP domains are generated by our structure-based integrative function prediction approach that combines structural mappings with other sequence and evolutionary clues (25). A detailed illustration of the data sources and framework is shown in Figure 1. Briefly, for a query SCOP domain, GO annotations are predicted with four component methods (structure-based, InterPro-based, PSSM-based and sequence homology-based methods). A probability for each annotation is calculated using a Bayesian network trained on a dataset of SCOP domains (ending in dash) generated from single-domain proteins.

Figure 1.

Flowchart of SDADB construction. For each domain in the SCOPe database, GO annotations are predicted with the four component methods: (i) GO annotations predicted using P2D mappings: protein-SCOP domain mappings are calculated by large-scale structure alignment, then the probability that a domain annotated by a specific function is computed; (ii) GO annotations predicted using Scop-InterPro mappings: we use InterProScan to search InterPro domains for the SCOP domain, and transfer the annotations of these InterPro domains in the InterPro2GO database to the target SCOP domain; (iii) GO annotations predicted using PSSM profiles: SVM models for GO function annotation are trained with fixed length of PSSM vectors, which are calculated using ACC transformation; (iv) GO annotations predicted using sequence homologs: we transfer the GO annotations of the sequence homologs in UniProt-GOA to the target SCOP domain. Finally, the SDADB database is built by integrating the outputs of the four component methods with a Bayesian network.

Open in new tab Download slide

GO annotations predicted using protein-domain structural mappings

We use a structure alignment algorithm (26, 27) to search structural neighbors for SCOP domains against the PDB database, and obtain a significant number of protein-domain (P2D) mappings. The structural similarity between proteins and domains is measured by protein structure distance (PSD) (26). The PSD score integrates the RMSD (root mean square deviation) and the secondary structural alignment score to measure the similarity of two structures and is applicable both when two structures are very similar, and when they are very different. Lower PSD score corresponds to a good fit or better alignment between the structures. A protein is defined as the structural neighbor of a SCOP domain when the PSD score is lower than 0.1. We assign GO annotations to SCOP domains based on the assumption that the most populated SCOP domain in the mappings corresponds to the structural neighbor proteins which are responsible for the function (25). The GO annotations of proteins are extracted from the PDB-GOA database (28). The probabilities of transferring protein function annotations to SCOP domains are computed as shown in the following equation:

P_{1} (g | d) = \frac{P (d, g)}{P (d)},

where d is a SCOP domain, g denotes a GO term, P(d, g) is the number of PDB proteins containing domain d and having function g, P(d) is the number of PDB proteins containing domain d calculated on the P2D mapping data.

GO annotations predicted using Scop-InterPro domain mappings

We use the InterProScan tool (6) to search the InterPro domains (5) against the SCOP sequences. GO terms of InterPro domains in the InterPro2GO database (28) are transferred to the corresponding SCOP domains. The probability of a GO term (g) assigned to the SCOP domain is calculated based on the number of InterPro domains that have the function:

P_{2} (g) = \frac{1}{n} {\sum^{}}_{i = 1}^{n} I (g),

where n is the number of InterPro domains owned by the SCOP domain. If an InterPro domain has the function g, I(g) is 1; otherwise 0.

GO annotations predicted using PSSM profiles

PSSM (29–32) is a highly informative representation of protein sequences and is widely utilized in many applications. We use PSI-BLAST (33) to calculate PSSM profiles based on the NCBI NR database (34). The auto-cross covariance (ACC) transformation (35) is used to transform the PSSM profiles into fixed-length vectors. For a domain sequence, auto covariance (AC) describes the average interactions between residues, a certain distance (l) apart throughout the whole sequence. For a descriptor (one of the 20 basic residue types), the AC variable is calculated as:

A C = \frac{1}{D L - l} {\sum^{}}_{i = 1}^{D L - l} (X_{i} - \bar{X}) (X_{i + l} - \bar{X}),

where l denotes the distance between one residue and its neighbor, DL is the length of domain sequence, X_i is the PSSM score of the descriptor at position i,

\bar{X}

is the average score for the descriptor along the whole sequence. We use the AC variables to transform the numerical PSSM vectors of SCOP domain sequences into uniform matrices with the distance l = 10. Based on the PSSM vectors, we build SVM classifiers for each GO term. The probability score of a GO term (g) assigned to a SCOP domain is estimated based on the output of the SVM classifier by a sigmoid function:

P_{3} (g) \approx \frac{1}{1 + e^{A f (g) + B}},

where f(g) is the SVM output score, A and B are estimated using the method of Lin et al. (36).

GO annotations predicted using sequence homologs

We also used PSI-BLAST to search sequence homologs for SCOP domains against the UniProt database. We only select the sequence homologs with alignment coverage >60%. The GO annotations of sequence homologs are obtained from the UniProt-GOA database (37). We use the sequence homolog’s E-value (E) to estimate the weight of its GO term assigned to the query SCOP domain. The probability for a GO term (g) assigned to the query domain is the sum of weights of sequence homologs that have the GO annotation in UniProt-GOA:

P_{4} (g) = \sum_{i = 1}^{n} \frac{- \log (E_{i}) + b}{\sum_{j = 1}^{n} \{- \log (E_{j}) + b\}} \cdot I (g)

where n is the number of sequence homologs, b is a constant of log(10). If the sequence homolog has the function g, I(g) is 1; otherwise 0.

Integration of the four component methods

We combine the output scores of the four component approaches using a Bayesian network (38). The Bayesian network represents the joint probability distribution of multiple variables and is especially suitable for integrating heterogeneous data sources. We split the probability scores of the four component methods into individual bins. The likelihood ratio (LR) (39) for any bin is calculated as the ratio of the odds of a GO annotation to be true or false after and before knowing it is in this bin. The LR represents the increase of the chance that a GO prediction method with a particular set of scores corresponds to a positive GO annotation, compared with a random GO assignment. The final probability score is calculated by integrating the LR of the four component methods:

L R = \prod_{i}^{N} L R (f_{i}),

where f is the component method.

The final predicted GOA-SCOP (GO annotation for SCOP domains) data are stored in an MYSQL database. The website is developed using Perl, JavaScript, jQuery (AJAX), CSS and HTML5, and is deployed on an Apache web server. The BioJava (40) and JSmol (41) provide visuals of the P2D alignment. D3 (D3.js) (42) is used for visualizing GO hierarchical tree.

Web server interface

The SDADB database can be queried through the protein/domain accession number (e.g. 1te0/d1te0a2) or protein/domain name (e.g. Stress sensor protease DegS). The server will return a list of GO annotations with corresponding confidence scores (Figure 2A). The detailed annotation information, including GO accession ID, GO type, GO name and associated score, are shown in the table. The associated score denotes the probability of the SCOP domain certain having the GO function. The higher the score is, the more likely the SCOP domain has the function. The default cut-off of the associated score is 0.5. The GO annotations with scores over the cut-off are colored in red. The users can change the threshold according to their own needs. Also, users can search GO annotations in the results and download the detailed results in Excel or CSV file.

Figure 2.

A snapshot of the SDADB web interface. (A) The GO annotations of a query domain are listed. (B) The GO tree view shows the hierarchical architecture of GO for the query domain.

Open in new tab Download slide

Users can view the annotations by clicking the ‘view GO tree’ button, which shows the hierarchical architecture of GO (Figure 2B). Users can expand or collapse the term nodes. The red nodes in the tree are annotated GO terms of the target SCOP domain. Users can view the GO name by putting the mouse over the node. Another unique feature is the visualization of structure alignment for the domain-protein mappings, which constitutes a major contribution to the function prediction. Users can choose to view the alignment between the target domain and its structural neighbors in 3D view (Figure 3).

Figure 3.

The structure alignment view for the domain-protein mappings.

Open in new tab Download slide

Results

To evaluate the accuracy of domain functional annotations, we use the dataset obtained from GOA-PDB version 201010 in training and the independent test set from GOA-PDB version 201311 excluding those in GOA-PDB version 201010 for testing. Proteins of the test set that have >90% sequence identity to the proteins in the training set are removed. We use the precision–recall curve and maximum F-measure (F_max) to measure the overall performance. The precision–recall curve shows the trade-off between precision and recall for different thresholds. A high area under the precision–recall curve denotes high overall performance. F-measure considers both the precision and the recall of the GO prediction results of SCOP domains. It is calculated as the harmonic average of the precision and recall. Maximum F-measure (F_max) is the maximum value of the F-measure over a varying threshold. The coverage is computed by dividing the number of domains with predicted GO annotations by the total number of domains in SCOPe 2.06. For detailed descriptions of the datasets and performance measures, see Reference (25).

We compare our SDADB database with the four component methods, including structure alignment-based method (Str), Interpro domain-based method (IPR), PSSM profile-based method (PSSM) and sequence homolog-based approach (Seq). The results are summarized in Table 1. We observe that the combined SDADB significantly outperforms the four component methods, with a maximum F-score of 0.833 for MF (molecular function), a maximum F-score of 0.723 for BP (biological process) and a maximum F-score of 0.809 for CC (cellular component). For the coverage, SDADB has GO annotations for most SCOP domains (92.3%). We also compare SDADB with other state-of-the-art approaches on the independent test dataset. As shown in Figure 4, it is clear that SDADB significantly outperforms SDA and other methods for both MF and BP.

Table 1.

Prediction performance comparison of SDADB with the four component methods

Methods	Coverage	Fmax (MF)	F_max (BP)	F_max (CC)
Str	0.802	0.806	0.690	0.767
Blast	0.918	0.755	0.616	0.687
PSSM	0.509	0.491	0.364	0.507
IPR	0.555	0.711	0.508	0.263
SDADB	0.923	0.833	0.723	0.809

Methods	Coverage	Fmax (MF)	F_max (BP)	F_max (CC)
Str	0.802	0.806	0.690	0.767
Blast	0.918	0.755	0.616	0.687
PSSM	0.509	0.491	0.364	0.507
IPR	0.555	0.711	0.508	0.263
SDADB	0.923	0.833	0.723	0.809

Table 1.

Prediction performance comparison of SDADB with the four component methods

Methods	Coverage	Fmax (MF)	F_max (BP)	F_max (CC)
Str	0.802	0.806	0.690	0.767
Blast	0.918	0.755	0.616	0.687
PSSM	0.509	0.491	0.364	0.507
IPR	0.555	0.711	0.508	0.263
SDADB	0.923	0.833	0.723	0.809

Methods	Coverage	Fmax (MF)	F_max (BP)	F_max (CC)
Str	0.802	0.806	0.690	0.767
Blast	0.918	0.755	0.616	0.687
PSSM	0.509	0.491	0.364	0.507
IPR	0.555	0.711	0.508	0.263
SDADB	0.923	0.833	0.723	0.809

Figure 4.

Precision–recall curve of SDADB versus existing methods for molecular function (A) and biological process (B).

Open in new tab Download slide

Conclusion

The SDADB database provides large-scale detailed GO annotations at the structural domain level. In contrast to the approaches based on sequence and homology information, an advantage of SDADB is that the method integrates structural neighborhood features together with a variety of heterogeneous information, including SCOP-InterPro domain mapping information, PSSMs and sequence homolog features. The SDADB database now contains 3 482 316 GO annotations for 211 282 SCOP domains with a probability >0.1. Of these, 1 479 652 annotations for 204 948 domains have a probability >0.5. Also, SDADB provides P2D mappings for over 191 060 PDB structures. The vast amount of P2D and domain-function mapping data in the SDADB database can help to investigate the functions of full-length proteins since domains are functional units of proteins. The database will also give valuable insights into protein domain evolution, which are not only likely to be fascinating but will also ultimately improve the power and accuracy of protein function prediction approaches.

It is worth pointing out that some common and multifunctional domains may be not well annotated since the presence of a common domain in several proteins does not necessarily imply that these proteins have the same function. Future developments will focus on combining more informative clues and analyzing tools. We also expect the interested user will be able to use the resources provided in the SDADB database as a basis for new efforts on expanding the functional space for both domains and full-length proteins.

Availability

The SDADB database is freely available at http://sda.denglab.org/.

Funding

This work was supported by National Natural Science Foundation of China (61672541); Natural Science Foundation of Hunan Province (2017JJ3287); Natural Science Foundation of Zhejiang (LY13F020038); Fundamental Research Funds for the Central Universities of Central South University (2017zzts727) and Shanghai Key Laboratory of Intelligent Information Processing (IIPL-2014-002).

Conflict of interest. None declared.

References

Marcotte

E.M.

Pellegrini

H.-L.

et al. (

1999

)

Detecting protein function and protein-protein interactions from genome sequences

Science

285

751

–

753

Hanks

S.K.

Quinn

A.M.

Hunter

(

1988

)

The protein kinase family: conserved features and deduced phylogeny

Science

241

–

Knighton

D.R.

Zheng

Lynn

et al. (

1991

)

Crystal structure of the catalytic subunit of cyclic adenosine monophosphate-dependent protein kinase

Science

407

–

414

Google Scholar

OpenURL Placeholder Text

WorldCat

Zou

Chen

Huang

et al. (

2013

)

Identifying multi-functional enzyme by hierarchical multi-label classifier

J. Comput. Theor. Nanosci

1038

–

1043

Google Scholar

Crossref

WorldCat

Finn

R.D.

Attwood

T.K.

Babbitt

P.C.

et al. (

2017

)

InterPro in 2017—beyond protein family and domain annotations

Nucleic Acids Res

D190

–

D199

Zdobnov

E.M.

Apweiler

(

2001

)

InterProScan—an integration platform for the signature-recognition methods in InterPro

Bioinformatics

847

–

848

Bateman

Coin

Durbin

et al. (

2004

)

The Pfam protein families database

Nucleic Acids Res

D138

–

D141

Sillitoe

Lewis

T.E.

Cuff

et al. (

2015

)

CATH: comprehensive structural and functional annotations for genome sequences

Nucleic Acids Res

D376

–

D381

Letunic

Doerks

Bork

(

2012

)

SMART 7: recent updates to the protein domain annotation resource

Nucleic Acids Res

D302

–

D305

Bru

Courcelle

Carrère

et al. (

2005

)

The ProDom database of protein domain families: more emphasis on 3D

Nucleic Acids Research

D212

–

D215

Oates

M.E.

Stahlhacke

Vavoulis

D.V.

et al. (

2015

)

The SUPERFAMILY 1.75 database in 2014: a doubling of data

Nucleic Acids Res

D227

–

D233

Hulo

Bairoch

Bulliard

et al. (

2006

)

The PROSITE database

Nucleic Acids Res

D227

–

D230

Marchler-Bauer

Derbyshire

M.K.

Gonzales

N.R.

et al. (

2015

)

CDD: nCBI's conserved domain database

Nucleic Acids Res

D222

–

D226

Murzin

A.G.

Brenner

S.E.

Hubbard

Chothia

(

1995

)

SCOP: a structural classification of proteins database for the investigation of sequences and structures

J. Mol. Biol

247

536

–

540

Google Scholar

PubMed

OpenURL Placeholder Text

WorldCat

Wei

Zou

(

2016

)

Recent progresses in machine learning-based methods for protein fold recognition

Int. J. Mol. Sci

2118.

Google Scholar

Crossref

WorldCat

Chandonia

J.-M.

Fox

N.K.

Brenner

S.E.

(

2017

)

SCOPe: manual curation and artifact removal in the structural classification of proteins–extended database

J. Mol. Biol

429

348

–

355

Fox

N.K.

Brenner

S.E.

Chandonia

J.-M.

(

2014

)

SCOPe: structural classification of proteins—extended, integrating SCOP and ASTRAL data and classification of new structures

Nucleic Acids Res

D304

–

D309

Zhang

Wang

et al. (

2018

)

Ontological function annotation of long non-coding RNAs through hierarchical multi-label classification

Bioinformatics

1750

–

1757

Zhang

Fan

et al. (

2017

)

KATZLGO: large-scale prediction of LncRNA functions by using the KATZ measure based on multiple networks

IEEE/ACM Trans. Comput. Biol. Bioinform

., doi: 10.1109/TCBB.2017.2704587.

Google Scholar

OpenURL Placeholder Text

WorldCat

Burge

Kelly

Lonsdale

et al. (

2012

)

Manual GO annotation of predictive protein signatures: the InterPro approach to GO curation

Database

2012

, doi: 10.1093/database/bar068.

Google Scholar

OpenURL Placeholder Text

WorldCat

Forslund

Sonnhammer

E.L.L.

(

2008

)

Predicting protein function from domain content

Bioinformatics

1681

–

1687

Rentzsch

Orengo

C.A.

(

2013

)

Protein function prediction using domain families

BMC Bioinformatics

S5.

Lopez

Pazos

(

2009

)

Gene ontology functional annotations at the structural domain level

Proteins Struct. Funct. Bioinform

598

–

607

Google Scholar

Crossref

WorldCat

Fang

Gough

(

2013

)

dcGO: database of domain-centric ontologies on functions, phenotypes, diseases and more

Nucleic Acids Res

D536

–

D544

Deng

Chen

(

2015

)

An integrated framework for functional annotation of protein structural domains

IEEE/ACM Trans. Comput. Biol. Bioinform

902

–

913

Yang

A.-S.

Honig

(

2000

)

An integrated approach to the analysis and modeling of protein sequences and structures. I. Protein structural alignment and a quantitative measure for protein structural distance

J. Mol. Biol

301

665

–

678

Zhang

Q.C.

Petrey

D., L.

Deng

L.Q.

Shi

et al. (

2012

)

Structure-based prediction of protein-protein interactions on a genome-wide scale

Nature

490

556.

Camon

Magrane

Barrell

et al. (

2004

)

The gene ontology annotation (GOA) database: sharing knowledge in uniprot with gene ontology

Nucleic Acids Res

262D

–

D266

Google Scholar

Crossref

WorldCat

Jones

D.T.

(

1999

)

Protein secondary structure prediction based on position-specific scoring matrices

J. Mol. Biol

292

195

–

202

Fan

Liu

Huang

et al.

PredRSA: a gradient boosted regression trees approach for predicting protein solvent accessibility

BMC Bioinformatics

Wei

Tang

Zou

(

2017

)

Local-DPP: an improved DNA-binding protein prediction method by exploring local evolutionary information

Inform. Sci

384

135

–

144

Google Scholar

Crossref

WorldCat

Pan

Wang

Zhan

et al. (

2018

)

Computational identification of binding energy hot spots in protein-RNA complexes using an ensemble approach

Bioinformatics

1473

–

1480

Altschul

S.F.

Madden

T.L.

Schäffer

A.A.

et al. (

1997

)

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs

Nucleic Acids Res

3389

–

3402

Godzik

(

2006

)

Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences

Bioinformatics

1658

–

1659

Guo

Wen

et al. (

2008

)

Using support vector machine combined with auto covariance to predict protein–protein interactions from protein sequences

Nucleic Acids Res

3025

–

3030

Lin

H.-T.

Lin

C.-J.

Weng

R.C.

(

2007

)

A note on Platt’s probabilistic outputs for support vector machines

Mach. Learn

267

–

276

Google Scholar

Crossref

WorldCat

Dimmer

E.C.

Huntley

R.P.

Alam-Faruque

et al. (

2012

)

The UniProt-GO annotation database in 2011

Nucleic Acids Res

D565

–

D570

Friedman

Geiger

Goldszmidt

(

1997

)

Bayesian network classifiers

Mach. Learn

131

–

163

Google Scholar

Crossref

WorldCat

Jansen

Greenbaum

et al. (

2003

)

A Bayesian networks approach for predicting protein-protein interactions from genomic data

Science

302

449

–

453

Prlić

Yates

Bliven

S.E.

et al. (

2012

)

BioJava: an open-source framework for bioinformatics in 2012

Bioinformatics

2693

–

2695

Hanson

R.M.

Prilusky

Renjian

et al. (

2013

)

JSmol and the Next-generation web-based representation of 3D molecular structure as applied to proteopedia

Israel J. Chem

207

–

216

Google Scholar

Crossref

WorldCat

Ogievetsky

Heer

Bostock

(

2011

)

D3 data-driven documents

IEEE Trans. Vis. Comput. Graph

2301

–

2309

Author notes

Citation details: Zeng,C., Zhan,W., Deng,L. SDADB: a functional annotation database of protein structural domains. Database (2018) Vol. 2018: article ID bay064; doi:10.1093/database/ bay064

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Download all slides

Month:	Total Views:
June 2018	60
July 2018	672
August 2018	158
September 2018	32
October 2018	34
November 2018	146
December 2018	40
January 2019	46
February 2019	26
March 2019	25
April 2019	99
May 2019	40
June 2019	33
July 2019	28
August 2019	25
September 2019	36
October 2019	53
November 2019	31
December 2019	28
January 2020	47
February 2020	22
March 2020	21
April 2020	15
May 2020	29
June 2020	101
July 2020	76
August 2020	18
September 2020	21
October 2020	13
November 2020	26
December 2020	13
January 2021	16
February 2021	17
March 2021	32
April 2021	33
May 2021	19
June 2021	13
July 2021	23
August 2021	18
September 2021	28
October 2021	28
November 2021	37
December 2021	15
January 2022	32
February 2022	27
March 2022	37
April 2022	29
May 2022	36
June 2022	43
July 2022	26
August 2022	38
September 2022	16
October 2022	19
November 2022	34
December 2022	17
January 2023	24
February 2023	24
March 2023	37
April 2023	13
May 2023	18
June 2023	15
July 2023	14
August 2023	21
September 2023	17
October 2023	26
November 2023	16
December 2023	32
January 2024	31
February 2024	31
March 2024	18
April 2024	39
May 2024	24
June 2024	21
July 2024	19
August 2024	13
September 2024	20
October 2024	31
November 2024	31
December 2024	20
January 2025	10
February 2025	13
March 2025	22
April 2025	7
May 2025	15
June 2025	9
July 2025	10
August 2025	23
September 2025	9
October 2025	16
November 2025	16
December 2025	7
January 2026	4
February 2026	5
March 2026	6
April 2026	1

Article Contents

SDADB: a functional annotation database of protein structural domains

Abstract

Introduction

Methods and data sources

GO annotations predicted using protein-domain structural mappings

GO annotations predicted using Scop-InterPro domain mappings

GO annotations predicted using PSSM profiles

GO annotations predicted using sequence homologs

Integration of the four component methods

Web server interface

Results

Conclusion

Availability

Funding

References

Author notes

Citations

Views

Altmetric

Citing articles via

Latest

Most Read

Most Cited

Article Contents

SDADB: a functional annotation database of protein structural domains Open Access

Abstract

Introduction

Methods and data sources

GO annotations predicted using protein-domain structural mappings

GO annotations predicted using Scop-InterPro domain mappings

GO annotations predicted using PSSM profiles

GO annotations predicted using sequence homologs

Integration of the four component methods

Web server interface

Results

Conclusion

Availability

Funding

References

Author notes

Citations

Views

Altmetric

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only

Gift article access

Gift article access

Gift article access

Gift article access

SDADB: a functional annotation database of protein structural domains