TparvaDB: a database to support Theileria parva vaccine development

Abstract

We describe the development of TparvaDB, a comprehensive resource to facilitate research towards development of an East Coast fever vaccine, by providing an integrated user-friendly database of all genome and related data currently available for Theileria parva. TparvaDB is based on the Generic Model Organism Database (GMOD) platform. It contains a complete reference genome sequence, Expressed Sequence Tags (ESTs), Massively Parallel Signature Sequencing (MPSS) expression tag data and related information from both public and private repositories. The Artemis annotation workbench provides online annotation functionality. TparvaDB represents a resource that will underpin and promote ongoing East Coast fever vaccine development and biological research.

Database URL: http://tparvadb.ilri.cgiar.org

Introduction

Theileria parva is a tick-transmitted haemoprotozoan parasite that causes an acute and often fatal disease of cattle, East Coast fever (ECF) (1). ECF severely constrains the livelihoods of poor livestock-keepers in sub-Saharan Africa. Current control methods include use of acaricides to limit tick populations, drug-treatment of cattle exhibiting clinical symptoms and deployment of a live vaccine that involves infection with a potentially lethal dose of cryo-preserved sporozoites and simultaneous treatment with long-acting oxy-tetracycline (2). A subunit vaccine will provide a long-term solution to this socio-economically important constraint to livestock development in Eastern and Southern Africa.

The completion of the genome sequence of T. parva Muguga (3) represents an important milestone in research on the parasite's biology and has contributed to the identification of candidate schizont antigens for vaccine development targeting this stage of the parasite (4). It is also an important resource for apicomplexan comparative genomics, in particular with Plasmodium falciparum, which causes malaria in humans, and T. annulata (5), the cause of tropical bovine theileriosis, a related disease of cattle that occurs widely across North Africa and South and East Asia. To date, use of the T. parva genome and associated information by interested scientists has been limited by the lack of a user-friendly interface that provides access not only to genome data but also to the large set of expression data from MPSS (6), EST data for the schizont stage, and additional published and unpublished microarray expression data (7). In addition, the current system of access does not easily allow annotation and curation to be updated as new data become available (8).

TparvaDB architecture

To address the need for a comprehensive resource to facilitate research towards an ECF vaccine and comparative apicomplexan genomics, TparvaDB was developed as a unified platform for all genome and related T. parva data (Figure 1). TparvaDB was built using components of the Generic Model Organism Database (GMOD) system (http://www.gmod.org). The core component of TparvaDB is Chado, an ontology-driven relational database schema (9), implemented in the PostgreSQL open source database management system (http://www.postgresql.org). Ontologies used in TparvaDB include the Sequence (10), Gene (11) and Relationship (12) ontologies.
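To illustrate what an ontology-driven relational schema means in practice, the following minimal sketch mimics three Chado tables (cvterm, feature and featureloc) and shows how a feature's type is resolved through an ontology term rather than stored as free text. It is a simplification under stated assumptions: an in-memory SQLite database stands in for PostgreSQL, only a few columns of each table are reproduced, and all feature names and coordinates are invented placeholders rather than TparvaDB content.

```python
import sqlite3


def build_demo_db():
    """Create a tiny, simplified Chado-like schema in memory (illustrative only)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE cvterm (
            cvterm_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        CREATE TABLE feature (
            feature_id INTEGER PRIMARY KEY,
            uniquename TEXT NOT NULL,
            type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
        );
        CREATE TABLE featureloc (
            feature_id    INTEGER NOT NULL REFERENCES feature(feature_id),
            srcfeature_id INTEGER NOT NULL REFERENCES feature(feature_id),
            fmin          INTEGER NOT NULL,  -- interbase start
            fmax          INTEGER NOT NULL,  -- interbase end
            strand        INTEGER NOT NULL
        );
    """)
    # Feature types are ontology terms; these two names come from the Sequence Ontology.
    con.executemany("INSERT INTO cvterm VALUES (?, ?)",
                    [(1, "chromosome"), (2, "gene")])
    # Placeholder features: names and coordinates are made up for this example.
    con.executemany("INSERT INTO feature VALUES (?, ?, ?)",
                    [(1, "chr_demo", 1), (2, "gene_demo_A", 2), (3, "gene_demo_B", 2)])
    con.executemany("INSERT INTO featureloc VALUES (?, ?, ?, ?, ?)",
                    [(2, 1, 100, 900, 1), (3, 1, 1500, 2400, -1)])
    return con


def features_of_type(con, type_name):
    """Return (uniquename, fmin, fmax, strand) for located features of one ontology type."""
    rows = con.execute("""
        SELECT f.uniquename, fl.fmin, fl.fmax, fl.strand
        FROM feature f
        JOIN cvterm t      ON t.cvterm_id = f.type_id
        JOIN featureloc fl ON fl.feature_id = f.feature_id
        WHERE t.name = ?
        ORDER BY fl.fmin
    """, (type_name,))
    return rows.fetchall()
```

Calling `features_of_type(build_demo_db(), "gene")` returns the two placeholder genes ordered by position. Because types live in `cvterm`, adding a new kind of feature requires a new ontology term rather than a schema change.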

Figure 1. Screenshot of TparvaDB interface providing access to several online tools.

Theileria parva Muguga genome data in GenBank format were converted to GFF3 format and loaded into the TparvaDB Chado database using Perl scripts. EST, MPSS and SignalP data were similarly converted to GFF3 format and uploaded to the database. Several tools were included in TparvaDB to provide functionality: GBrowse, NCBI Blast and the online annotation tool Artemis (13).
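The conversion step can be pictured as writing each feature as one nine-column, tab-separated GFF3 record. The sketch below is not the Perl code used for TparvaDB; it is a hypothetical Python stand-in that only shows the target record layout. The function names, the `example` source label and the demo feature are assumptions made for illustration.

```python
from urllib.parse import quote

GFF3_HEADER = "##gff-version 3"


def format_attributes(attrs):
    """Encode a dict as a GFF3 column-9 string, percent-encoding reserved characters."""
    parts = []
    for key, value in attrs.items():
        # ';', '=', '&' and ',' have special meaning in column 9 and must be escaped.
        parts.append(f"{key}={quote(str(value), safe=' :/')}")
    return ";".join(parts)


def to_gff3_line(seqid, source, ftype, start, end, strand, attrs, score=".", phase="."):
    """Build one GFF3 record; coordinates are 1-based and inclusive."""
    if start < 1 or end < start:
        raise ValueError("GFF3 coordinates are 1-based with start <= end")
    if strand not in {"+", "-", ".", "?"}:
        raise ValueError(f"invalid strand: {strand!r}")
    columns = [seqid, source, ftype, str(start), str(end),
               str(score), strand, str(phase), format_attributes(attrs)]
    return "\t".join(columns)


if __name__ == "__main__":
    print(GFF3_HEADER)
    print(to_gff3_line("chr_demo", "example", "gene", 101, 900, "+",
                       {"ID": "gene_demo_A", "Name": "demo; gene"}))
```

Validating coordinates and strand at write time is a deliberate choice in this sketch: malformed records are easier to catch before loading than after they reach the database.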

GBrowse is a platform-independent, extensible web-based graphical interface for the visualization of genomic features stored in a database (14). In TparvaDB, GBrowse uses the Bio::DB::Das::Chado Perl interface to connect directly to the TparvaDB Chado database, and the Bio::Graphics Perl libraries to create images. GBrowse combines a genome overview with a detailed view of specific regions of the genome. A user can query chromosomal regions of interest and visualise specific features such as annotated genes, ESTs and MPSS data mapped to a specific region (Figure 2). Online data analysis plug-ins provide detailed information on specific features that can be downloaded.
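Conceptually, a region query of this kind is an interval-overlap filter over the stored features. The following sketch shows that idea on GFF3 text held in memory. It does not reproduce how GBrowse or Bio::DB::Das::Chado work internally, and the `Feature` type, function names and demo records are hypothetical.

```python
from typing import List, NamedTuple, Optional, Set


class Feature(NamedTuple):
    seqid: str
    ftype: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    name: str


def parse_gff3(text: str) -> List[Feature]:
    """Parse GFF3 records, skipping comments, blank lines and malformed rows."""
    features = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if "=" in pair)
        features.append(Feature(cols[0], cols[2], int(cols[3]), int(cols[4]),
                                attrs.get("ID", "")))
    return features


def overlapping(features: List[Feature], seqid: str, start: int, end: int,
                ftypes: Optional[Set[str]] = None) -> List[Feature]:
    """Return features on seqid that overlap [start, end], optionally limited by type."""
    hits = [f for f in features
            if f.seqid == seqid
            and f.start <= end and f.end >= start   # closed-interval overlap test
            and (ftypes is None or f.ftype in ftypes)]
    return sorted(hits, key=lambda f: (f.start, f.end))


DEMO_GFF3 = "\n".join([
    "##gff-version 3",
    "\t".join(["chr_demo", "example", "gene", "101", "900", ".", "+", ".", "ID=gene_demo_A"]),
    "\t".join(["chr_demo", "example", "EST_match", "850", "1200", ".", "+", ".", "ID=est_demo_1"]),
    "\t".join(["chr_demo", "example", "gene", "1501", "2400", ".", "-", ".", "ID=gene_demo_B"]),
])
```

For example, `overlapping(parse_gff3(DEMO_GFF3), "chr_demo", 800, 1000)` selects the first placeholder gene and the placeholder EST match, which is the kind of per-region track content shown in a browser view.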

Figure 2. A representative example of the GBrowse interface. Tracks display information on annotated genes, predicted signal peptides, mapped ESTs, %GC content and MPSS signatures in a region of Chromosome 1.

The NCBI Blast (15) tool allows homology searches against either nucleotide or protein sequences from T. parva (Muguga). In future database releases, data for other T. parva isolates that are currently being sequenced will be included as they are completed.

We implemented online annotation functionality using Artemis, a DNA sequence viewer and annotation tool (13). Artemis supports Chado databases through the iBATIS DataMapper API (http://ibatis.apache.org), allowing real-time connection to the underlying TparvaDB Chado database. Users who have been granted permission can log in to the Chado database remotely and make changes to T. parva gene models online through the Artemis interface (Figure 3).

Figure 3. Screenshot of the Artemis annotation tool allowing direct access to T. parva genome data in the TparvaDB database for online annotation. The background image depicts the Artemis interface and the foreground image depicts the Gene Builder view.

Theileria parva re-annotation

As mentioned, several additional T. parva isolates are currently being sequenced, which will enable comparative genomic analysis to aid ECF vaccine research. One of the most effective methods to annotate a newly sequenced genome is to compare it with a well-annotated, closely related genome using computational tools and databases. To improve the quality of the reference genome, we used TparvaDB to re-annotate the T. parva Muguga genome using data that had become available since the original annotation, most importantly a large EST data set. Theileria parva schizont and sporozoite EST data were downloaded from the NCBI dbEST database (16), mapped to the T. parva genome using both PASA (17) and BLAT (18), and the results loaded into TparvaDB. The original T. parva genome data and the mapped EST data were visualized, and gene models were manually annotated using Artemis. In total, 13 new gene models were created and 23 gene models were corrected, as shown in Table 1.

Table 1. Annotation changes based on EST-based re-annotation

Chromosome	Gene model	Change	Comment
1	TP01_0115	Modified CDS
1	TP01_0289	Modified CDS	Exon added
1	TP01_0869	Modified CDS	Exon added
1	TP01_1252	New CDS
1	TP01_1253	New CDS
1	TP01_1254	Modified CDS	Exon added
1	TP01_1255	New CDS
2	TP02_0127	Modified CDS	Exon added
2	TP02_0140	Modified CDS	Exon added
2	TP02_0173	Modified CDS	Exon added
2	TP02_0181	Modified CDS	Exon added
2	TP02_0209	Modified CDS	Exon added
2	TP02_0236	Modified CDS	Exon added
2	TP02_0247	Modified CDS	Exon added
2	TP02_0942	Modified CDS	Exon added
2	TP02_0980	New CDS
2	TP02_0981	New CDS
2	TP02_0982	New CDS
2	TP02_0983	New CDS
2	TP02_0984	New CDS
2	TP02_0986	New CDS
3	TP03_0029	Modified CDS	Exon added
3	TP03_0041	Modified CDS	Exon added
3	TP03_0224	Modified CDS	Four exons added
3	TP03_0349	Modified CDS	Exon added
3	TP03_0413	Modified CDS	Exon added
3	TP03_0542	Modified CDS	Exon added
3	TP03_0574	Modified CDS	Exon added
3	TP03_0792	Modified CDS	Exon added
3	TP03_0862	Modified CDS	Exon added
3	TP03_0945	New CDS
3	TP03_0946	New CDS
4	TP04_0943	New CDS
4	TP04_0566	Modified CDS	Exon added
4	TP04_0945	New CDS
4	TP04_0843	Modified CDS	Exon added
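The totals reported for Table 1 can be checked with a short tally. The sketch below transcribes the gene model identifiers from the table, grouped by the Change column, and counts them overall and per chromosome. Reading the chromosome number from the identifier prefix is an assumption made for this example; it agrees with the Chromosome column for every row listed.

```python
from collections import Counter

# Gene model identifiers transcribed from Table 1, grouped by the "Change" column.
NEW_CDS = (
    "TP01_1252 TP01_1253 TP01_1255 "
    "TP02_0980 TP02_0981 TP02_0982 TP02_0983 TP02_0984 TP02_0986 "
    "TP03_0945 TP03_0946 "
    "TP04_0943 TP04_0945"
).split()

MODIFIED_CDS = (
    "TP01_0115 TP01_0289 TP01_0869 TP01_1254 "
    "TP02_0127 TP02_0140 TP02_0173 TP02_0181 TP02_0209 TP02_0236 TP02_0247 TP02_0942 "
    "TP03_0029 TP03_0041 TP03_0224 TP03_0349 TP03_0413 TP03_0542 TP03_0574 TP03_0792 "
    "TP03_0862 "
    "TP04_0566 TP04_0843"
).split()


def chromosome_of(gene_model):
    """Read the chromosome number from the identifier prefix, e.g. 'TP03_0224' -> 3."""
    return int(gene_model[2:4])


def tally_by_chromosome(gene_models):
    """Count gene models per chromosome, returned as a dict ordered by chromosome."""
    counts = Counter(chromosome_of(g) for g in gene_models)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print("New CDS:", len(NEW_CDS), tally_by_chromosome(NEW_CDS))
    print("Modified CDS:", len(MODIFIED_CDS), tally_by_chromosome(MODIFIED_CDS))
```

The tally gives 13 new and 23 modified gene models, matching the figures stated in the text.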

Future development of T. parva DB

To expedite selection of potential vaccine targets, and to determine the impact and potential limitations of extensive vaccine deployment, additional T. parva genomes are being sequenced. These new genomes can be quickly annotated using TparvaDB as they become available. We plan to incorporate a comparative genomics component into TparvaDB to facilitate identification of conserved and divergent regions between isolates. This component could be implemented either through GBrowse_syn (http://gmod.org/wiki/GBrowse_syn), a GBrowse-based synteny browser, or by interfacing TparvaDB with other comparative database systems such as Sybil (http://sybil.sourceforge.net/). In addition, the ECF vaccine development project is generating data on additional candidate antigens, bovine class I MHC sequences and other immunological data that can also be incorporated into TparvaDB. To manage this increasingly sophisticated information more efficiently, we plan to implement a query system based on BioMart, a query-oriented data management system that provides 'data mining'-like searches of complex descriptive data (19).

Funding

Multilateral core funding provided to the International Livestock Research Institute supported this work. This is ILRI publication number IL-201101. Funding for open access charge: International Livestock Research Institute.

Conflict of interest. None declared.

References

1. Norval RAI, Perry, Young. The Epidemiology of Theileriosis in Africa. London: Academic Press, 1992.
2. Radley. Infection and Treatment Method of Immunization Against Theileriosis. The Hague: Martinus Nijhoff Publishers, 1981.
3. Gardner, Bishop, Shah, et al. Genome sequence of Theileria parva, a bovine pathogen that transforms lymphocytes. Science, 2005, vol. 309, pp. 134-137.
4. Graham, Pelle, Honda, et al. Theileria parva candidate vaccine antigens recognized by immune bovine cytotoxic T lymphocytes. Proc. Natl Acad. Sci. USA, 2006, vol. 103, pp. 3286-3291.
5. Pain, Renauld, Berriman, et al. Genome of the host-cell transforming parasite Theileria annulata compared with T. parva. Science, 2005, vol. 309, pp. 131-133.
6. Bishop, Shah, Pelle, et al. Analysis of the transcriptome of the protozoan Theileria parva using MPSS reveals that the majority of genes are transcriptionally active in the schizont stage. Nucleic Acids Res., 2005, pp. 5503-5511.
7. Schmuckli-Maurer, Casanova, Schmied, et al. Expression analysis of the Theileria parva subtelomere-encoded variable secreted protein gene family. PLoS One, 2009, p. e4839.
8. Shah, de Villiers, Nene, et al. Using the transcriptome to annotate the genome revisited: application of massively parallel signature sequencing (MPSS). Gene, 2006, vol. 366, pp. 104-108.
9. Mungall, Emmert. A Chado case study: an ontology-based modular schema for representing genome-associated biological information. Bioinformatics, 2007, pp. i337-i346.
10. Eilbeck, Lewis, Mungall, et al. The Sequence Ontology: a tool for the unification of genome annotations. Genome Biol., 2005, p. R44.
11. Ashburner, Ball, Blake, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet., 2000.
12. Smith, Ceusters, Klagges, et al. Relations in biomedical ontologies. Genome Biol., 2005, p. R46.
13. Carver, Berriman, Tivey, et al. Artemis and ACT: viewing, annotating and comparing sequences stored in a relational database. Bioinformatics, 2008, pp. 2672-2676.
14. Stein, Mungall, Shu, et al. The generic genome browser: a building block for a model organism system database. Genome Res., 2002, pp. 1599-1610.
15. Altschul, Madden, Schaffer, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 1997, pp. 3389-3402.
16. Rodriguez-Tome. Searching the dbEST database. Methods Mol. Biol., 1997, pp. 269-283.
17. Haas, Delcher, Mount, et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res., 2003, pp. 5654-5666.
18. Kent. BLAT–the BLAST-like alignment tool. Genome Res., 2002, pp. 656-664.
19. Smedley, Haider, Ballester, et al. BioMart–biological queries made easy. BMC Genomics, 2009.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
