Abstract

Fish are a crucial component of aquatic ecosystems and are important from both economic and ecological perspectives. However, identifying fish at the species level remains challenging, and a taxonomically complete and comprehensive reference sequence database for fish is still lacking. Therefore, we developed CoSFISH, an online fish database. Currently, the database contains 21 535 cytochrome oxidase I sequences and 1074 18S rRNA sequences from 21 589 species belonging to 8 classes and 90 orders. It also incorporates online analysis tools that help users compare, align and analyze sequences and design primers. Users can upload their own data for analysis or use the data stored in the database directly. By combining an extensive sequence collection with online analysis tools, CoSFISH is a valuable resource for the study of fish diversity, phylogenetics and biological evolution.

Database URL:  http://210.22.121.250:8888/CoSFISH/home/indexPage.

Introduction

Fish comprise over 50% of all vertebrate species globally, with an estimated 60 000 species (1), of which more than 35 600 have been recorded (2). Because of their economic and ecological significance, fish have attracted considerable attention, especially economically important species in the aquaculture industry and model species used in scientific research. Accurate species identification is essential for any fish-based study. However, species-level classification of fish remains challenging. Morphological features can be misleading, because individuals of the same species, fish included, can exhibit phenotypic flexibility (3). Molecular markers have therefore been developed to fill this gap. Among the wide variety of molecular markers, mitochondrial DNA markers are widely used in species delineation and phylogenetic analysis because of their maternal inheritance, lack of recombination and higher mutation rate compared with nuclear DNA markers (4). Among these, mitochondrial cytochrome oxidase I (COI) is the most widely used for DNA barcoding (5). However, the resolution of mitochondrial DNA markers can be limited, particularly because fish hybrids are common and a maternally inherited marker reflects only the maternal lineage, so mitochondrial markers alone are insufficient for species classification. Nuclear markers are therefore essential for precise species identification and population genetic analyses. Combining mitochondrial markers, such as Cytb, COI and 16S rRNA (5), with nuclear markers, such as ITS1, ITS2, 18S rRNA and 28S rRNA (6), provides more comprehensive genetic insight, richer information, broader field application and a deeper understanding of genetic diversity and evolution.

Several databases have been developed for fish, such as FishBase, MitoFish, FishDB, Fish-T1K, FishGET, FishSCT, FishSED and cnfishbase. FishBase (https://www.fishbase.se/home.php) is a comprehensive online database of global fish species, offering detailed information on taxonomy, distribution, morphology, ecology, behavior and biology, accompanied by corresponding pictures. MitoFish (http://mitofish.aori.u-tokyo.ac.jp) is a database dedicated to fish mitogenomes that also provides online mitogenome assembly, annotation and phylogenetic analysis (7, 8). FishDB (http://fishdb.ihb.ac.cn) integrates fish genomes and transcriptomes and allows online comparative genomics analysis of orthologs (9). Fish-T1K (https://db.cngb.org/fisht1k), FishGET (http://bioinfo.ihb.ac.cn/fishget), FishSCT (http://bioinfo.ihb.ac.cn/fishsct) and FishSED (http://bioinfo.ihb.ac.cn/fishsed) all focus on fish transcriptomes, enhancing our understanding of fish biology and evolution (10–13). cnfishbase (https://cnfishbase.cn) collects data on Chinese fishes, including their taxonomy and visualized distribution (14). Nevertheless, a taxonomically exhaustive and comprehensive reference sequence database for fish is still needed to facilitate the widespread implementation of DNA barcoding and metabarcoding in the monitoring and evaluation of aquatic ecosystems. Compiling regional data to support large-scale fish diversity research remains a significant challenge. Furthermore, identifying fish species, especially hybrids, is currently difficult. In this study, we therefore developed the CoSFISH database, which includes COI and 18S rRNA sequences for recorded global fish species, to provide a more comprehensive view of fish diversity and phylogenetics.

Materials and methods

Data collection

As COI is the most extensively used mitochondrial DNA marker for fish, we incorporated COI barcodes into the CoSFISH database. We examined various nuclear markers for fish, such as 5.8S, ITS1, ITS2, 18S and 28S, and found that 18S rRNA was the most common marker among fish species. Our database therefore uses COI and 18S rRNA as the principal mitochondrial and nuclear markers, respectively. Following the taxonomic classification of fish species in Fishes of the World (fifth edition) (1), we collected basic species information, such as species names, pictures, taxonomy and distribution, from the FishBase website (https://www.fishbase.se/home.php) (2) and downloaded the corresponding COI and 18S rRNA sequences from NCBI (https://www.ncbi.nlm.nih.gov/) (15). When a species had more than one COI or 18S rRNA sequence in NCBI, the longest sequence was selected. However, NCBI lacks precise taxonomic information for some sequences. We therefore cross-referenced these sequences with the BOLD database (http://www.barcodinglife.org) (16) to finalize the dataset.
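The longest-sequence rule described above can be illustrated with a minimal Python sketch. It is not the pipeline used to build CoSFISH: the FASTA header layout (`<accession> <Genus> <species> ...`), the accession strings and the helper names `parse_fasta`, `longest_per_species` and `species_from_header` are all assumptions made for illustration.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def species_from_header(header):
    # Assumed header layout: "<accession> <Genus> <species> ..."
    parts = header.split()
    return " ".join(parts[1:3])


def longest_per_species(records, species_of):
    """Keep the longest sequence per species; ties keep the first one seen."""
    best = {}
    for header, seq in records:
        species = species_of(header)
        if species not in best or len(seq) > len(best[species][1]):
            best[species] = (header, seq)
    return best


# Placeholder records; the accessions and sequences are invented.
EXAMPLE = """>ACC001 Danio rerio COI partial
ATGGCACTA
>ACC002 Danio rerio COI
ATGGCACTATTAGGC
>ACC003 Cyprinus carpio COI
ATGGCA
"""

selected = longest_per_species(parse_fasta(EXAMPLE), species_from_header)
for species, (header, seq) in sorted(selected.items()):
    print(species, header.split()[0], len(seq))
```

In this toy input, the two Danio rerio records collapse to the longer one, so each species contributes a single sequence per marker.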

Database construction

The CoSFISH database was deployed on the Ubuntu 20.04 operating system and developed using AKKA 2.13 (web server), MySQL 8.0.30 (database server), Scala 2.13.2 and SBT 1.3.9. All data are stored and managed in the MySQL database management system. The query function was implemented using the Slick 3.3.2 middleware tier. The website interface was designed and implemented using Bootstrap 4.6.0 and Play Framework 2.8.7. The website has been tested in multiple popular web browsers, including Firefox, Google Chrome and Internet Explorer, and is publicly available at http://210.22.121.250:8888/CoSFISH/home/indexPage.

Results

CoSFISH data content

A total of 21 589 fish species, belonging to 8 classes and 90 orders, are included in CoSFISH. In total, 21 535 COI sequences and 1074 18S rRNA sequences were retrieved from NCBI. Species names, pictures, taxonomy and distribution information were added from FishBase. For COI, the most abundant order is Perciformes (2520 sequences), followed by Cypriniformes (2090) and Siluriformes (1853) (Table 1). For 18S rRNA, the most frequently sequenced order is also Perciformes (142 sequences), followed by Syngnathiformes (94) and Siluriformes (84) (Table 1).

Table 1. The number of COI and 18S rRNA sequences contained in the CoSFISH database

Order                  Number of 18S rRNA sequences    Number of COI sequences
Acanthuriformes        22                              320
Acipenseriformes       24                              28
Acropomatiformes       2                               130
Albuliformes           1                               11
Alepocephaliformes     0                               56
Amiiformes             1                               1
Amphioxiformes         3                               8
Anabantiformes         18                              203
Anguilliformes         23                              532
Argentiniformes        2                               49
Ateleopodiformes       0                               5
Atheriniformes         18                              199
Aulopiformes           3                               249
Batrachoidiformes      3                               33
Beloniformes           12                              181
Beryciformes           3                               48
Blenniiformes          10                              840
Caproiformes           0                               16
Carangiformes          12                              186
Carcharhiniformes      8                               200
Centrarchiformes       29                              172
Ceratodontiformes      3                               6
Chaetodontiformes      4                               192
Characiformes          52                              1595
Chimaeriformes         1                               38
Cichliformes           56                              641
Clupeiformes           19                              414
Coelacanthiformes      1                               1
Cypriniformes          56                              2090
Cyprinodontiformes     19                              540
Echinorhiniformes      1                               2
Elopiformes            2                               9
Ephippiformes          0                               11
Esociformes            1                               15
Gadiformes             12                              250
Galaxiiformes          19                              31
Gerreiformes           1                               108
Gobiiformes            38                              1584
Gonorynchiformes       2                               6
Gymnotiformes          2                               164
Heterodontiformes      0                               4
Hexanchiformes         1                               4
Hiodontiformes         1                               2
Holocentriformes       3                               79
Istiophoriformes       2                               50
Kurtiformes            4                               512
Labriformes            13                              541
Lamniformes            6                               15
Lampriformes           1                               18
Lepidogalaxiiformes    1                               1
Lobotiformes           1                               2
Lophiiformes           0                               138
Lutjaniformes          7                               293
Moroniformes           5                               8
Mugiliformes           10                              288
Myctophiformes         1                               197
Myliobatiformes        6                               177
Myxiniformes           0                               35
Notacanthiformes       0                               22
Ophidiiformes          9                               190
Orectolobiformes       7                               23
Osmeriformes           9                               57
Osteoglossiformes      7                               117
Ovalentaria            4                               187
Pempheriformes         0                               6
Perciformes            142                             2520
Percopsiformes         0                               7
Petromyzontiformes     3                               38
Pleuronectiformes      43                              470
Polymixiiformes        0                               6
Polypteriformes        2                               6
Priacanthiformes       1                               37
Rajiformes             3                               143
Rhinopristiformes      1                               25
Salmoniformes          20                              140
Scombriformes          22                              254
Semionotiformes        2                               8
Siluriformes           84                              1853
Spariformes            31                              391
Squaliformes           5                               125
Squatiniformes         1                               24
Stomiiformes           0                               194
Stylephoriformes       0                               1
Synbranchiformes       2                               77
Syngnathiformes        94                              571
Tetraodontiformes      28                              341
Torpediniformes        1                               30
Trachichthyiformes     1                               24
Uranoscopiformes       6                               97
Zeiformes              1                               23
Total                  1074                            21 535
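As a small worked example, the rankings stated in the text can be reproduced from the Table 1 counts. The sketch below uses only six of the largest orders, so the ranking holds within that subset; the helper name `top_orders` is an assumption for illustration.

```python
# (18S rRNA count, COI count) for six orders, copied from Table 1.
counts = {
    "Perciformes": (142, 2520),
    "Cypriniformes": (56, 2090),
    "Siluriformes": (84, 1853),
    "Characiformes": (52, 1595),
    "Gobiiformes": (38, 1584),
    "Syngnathiformes": (94, 571),
}


def top_orders(table, column, n=3):
    """Return the n order names with the highest count in the given column."""
    ranked = sorted(table.items(), key=lambda item: item[1][column], reverse=True)
    return [name for name, _ in ranked[:n]]


top_18s = top_orders(counts, 0)
top_coi = top_orders(counts, 1)
print("18S rRNA:", top_18s)
print("COI:", top_coi)
```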

Database homepage

The CoSFISH database provides a user-friendly web interface for accessing and analyzing the data. The homepage is divided into two sections: a navigation bar at the top and the main content below it. The navigation bar provides six core functions: ‘Home’, ‘18S’, ‘COI’, ‘BLAST’, ‘Tools’ and ‘Download’. The name and logo of the database appear above the navigation bar. The main content includes an introduction to the database, representative fish pictures, a quick search box, data statistics, release notes and global visitors (Figure 1).

Figure 1. Home page of the CoSFISH database.

Search and browse

CoSFISH can be searched in several ways, including quick search, advanced search and BLAST. The search box on the home page supports quick searches by taxonomic level, accession number or species name. CoSFISH then returns a list of results matching the input. Each result links to the detail page of the sequence and its species (Figure 2A). Each sequence page includes basic information (Figure 2B), a description of geographic distribution (Figure 2C), references for the sequence (Figure 2D) and taxonomic information (Figure 2E) for the corresponding species. Links to the FishBase and NCBI source records for each species and sequence are also provided (Figure 2B).
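The matching behavior of a quick search of this kind can be sketched as follows. This is only an illustration of searching by accession number, species name or taxonomic level; it does not reflect the actual CoSFISH query code (which runs against MySQL through Slick), and the `SequenceRecord` fields, the `quick_search` function and the accession strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceRecord:
    accession: str  # placeholder accession, not a real NCBI identifier
    species: str
    order: str
    family: str
    marker: str  # "COI" or "18S"


def quick_search(records, query):
    """Match a query against accession, species name or taxonomic rank (case-insensitive)."""
    q = query.strip().lower()
    if not q:
        return []
    hits = []
    for record in records:
        if (
            record.accession.lower() == q
            or q in record.species.lower()
            or q in (record.order.lower(), record.family.lower())
        ):
            hits.append(record)
    return hits


RECORDS = [
    SequenceRecord("ACC001", "Cyprinus carpio", "Cypriniformes", "Cyprinidae", "COI"),
    SequenceRecord("ACC002", "Salmo salar", "Salmoniformes", "Salmonidae", "COI"),
    SequenceRecord("ACC003", "Gadus morhua", "Gadiformes", "Gadidae", "18S"),
]

by_order = quick_search(RECORDS, "Salmoniformes")
by_species = quick_search(RECORDS, "gadus")
by_accession = quick_search(RECORDS, "ACC001")
print([r.accession for r in by_order], [r.accession for r in by_species], [r.accession for r in by_accession])
```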

Figure 2. Quick search page (A) and the detail page of each sequence (B–E). (A) Example of the quick search result. (B) Basic information of each sequence and its corresponding species. (C) Geographic distribution information of the species corresponding to the sequence. (D) References of the sequence. (E) Taxonomic information of the species corresponding to the sequence.

Advanced search is available through the ‘18S’ and ‘COI’ functions in the navigation bar. Each of the ‘18S’ and ‘COI’ pages has two parts: browse and taxonomy. The browse function lists all ‘18S’ or ‘COI’ data and lets users choose which fields to display, such as Accession number, Species, Phylum, Class, Order, Family, Genus and Length (Figure 3A). The taxonomy function offers a more flexible way to find specific information: users can click through the taxonomy to select the Phylum, Class, Order, Family, Genus and Species of interest. For each taxonomic level, the number of corresponding sequences is displayed on the right, alongside the corresponding result page (Figure 3B). In addition, BLAST is integrated into the database to compare query sequences against the DNA sequences in the database, returning a list of hits in descending order of similarity. Users can set parameters to adjust search sensitivity and choose the format of the search results (Figure 3C).
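For readers who post-process BLAST results themselves, the sketch below parses BLAST+ tabular output (the 12 default columns of `-outfmt 6`) and orders hits by descending similarity. Two assumptions are made here: that the results are exported in this tabular format, and that "similarity" is ranked by percent identity with bit score as a tie-breaker. The example rows and subject identifiers are invented.

```python
from typing import NamedTuple


class BlastHit(NamedTuple):
    query: str
    subject: str
    percent_identity: float
    alignment_length: int
    evalue: float
    bit_score: float


def parse_tabular(text):
    """Parse BLAST+ tabular output with the 12 default -outfmt 6 columns."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"expected 12 tab-separated columns, got {len(fields)}")
        hits.append(
            BlastHit(
                query=fields[0],
                subject=fields[1],
                percent_identity=float(fields[2]),
                alignment_length=int(fields[3]),
                evalue=float(fields[10]),
                bit_score=float(fields[11]),
            )
        )
    return hits


def rank_hits(hits):
    """Sort hits by descending percent identity, then by descending bit score."""
    return sorted(hits, key=lambda h: (-h.percent_identity, -h.bit_score))


# Invented rows for illustration only; subject IDs are placeholders.
ROWS = [
    ("query1", "SUBJ_A", "97.50", "640", "16", "0", "1", "640", "1", "640", "0.0", "1090"),
    ("query1", "SUBJ_B", "99.84", "642", "1", "0", "1", "642", "1", "642", "0.0", "1180"),
    ("query1", "SUBJ_C", "88.10", "630", "75", "0", "5", "634", "3", "632", "1e-150", "760"),
]
EXAMPLE = "\n".join("\t".join(row) for row in ROWS)

ranked = rank_hits(parse_tabular(EXAMPLE))
for hit in ranked:
    print(hit.subject, hit.percent_identity, hit.bit_score)
```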

Figure 3. Advanced search function. (A) Using ‘browse’ function under ‘18S’ and ‘COI’ function in the navigation bar to view the available sequences. (B) Through ‘taxonomy’ function under ‘18S’ and ‘COI’ function in the navigation bar to find the desired sequences. (C) BLAST page and an example of the BLAST search result.

Online analysis tools

Several online analysis tools are available in the CoSFISH database, including BLAST (17), MUSCLE (18), GeneWise (19), LASTZ (20) and Primer3 (21). Because phylogenetic analysis is complex and time-consuming, we also provide a link to CIPRES (22), an online platform dedicated to phylogenetic analyses. Users can either upload their own data in FASTA or TXT format or use the data available in the CoSFISH database. These tools help users compare, align and analyze DNA sequences and design primers. When an analysis completes, a result page is displayed automatically, together with downloadable result files.
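Before uploading their own data, users may want to check that a file is well-formed DNA FASTA. The following is a minimal sketch of such a check, not the validation performed by the CoSFISH server; the function name `validate_fasta` and the choice to accept the IUPAC nucleotide codes are assumptions.

```python
# IUPAC nucleotide codes accepted as DNA characters in this sketch.
IUPAC_DNA = set("ACGTRYSWKMBDHVN")


def validate_fasta(text):
    """Return a list of problems; an empty list means the text looks like DNA FASTA."""
    problems = []
    lengths = {}
    current = None
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].strip()
            if not current:
                problems.append(f"line {number}: empty header")
                current = f"<unnamed@{number}>"
            lengths[current] = 0
        elif current is None:
            problems.append(f"line {number}: sequence data before first '>' header")
        else:
            bad = set(line.upper()) - IUPAC_DNA
            if bad:
                problems.append(f"line {number}: unexpected characters {''.join(sorted(bad))}")
            lengths[current] += len(line)
    for name, length in lengths.items():
        if length == 0:
            problems.append(f"record '{name}': no sequence")
    if not lengths and not problems:
        problems.append("no FASTA records found")
    return problems


print(validate_fasta(">seq1\nACGTNNACGT\n"))
print(validate_fasta(">seq2\nACGX\n"))
```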

Download

All the COI and 18S rRNA sequences contained within the CoSFISH database are readily accessible and freely downloadable in FASTA or TXT format. Specifically, users can selectively download DNA sequences by marker type, Phylum, Class and Order, based on their academic objectives (Figure 4). For analysis results obtained through online tools, we also offer the option to download the result files in TXT, Newick or SVG formats. In addition, MD5 (Message Digest Algorithm 5) checksums are provided to verify the integrity of the downloaded data.
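A downloaded file can be checked against its published MD5 checksum with Python's standard `hashlib` module. In the sketch below, the temporary file and its checksum stand in for a real download and the value listed on the download page; the function names `md5_of_file` and `verify_download` are illustrative.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches the published checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Demonstration: a temporary file stands in for a downloaded FASTA file.
payload = b">example\nACGT\n"
with tempfile.NamedTemporaryFile(delete=False, suffix=".fasta") as tmp:
    tmp.write(payload)
    demo_path = tmp.name
try:
    published = hashlib.md5(payload).hexdigest()  # stands in for the listed checksum
    ok = verify_download(demo_path, published)
    tampered = verify_download(demo_path, "0" * 32)
finally:
    os.remove(demo_path)
print(ok, tampered)
```

A mismatch indicates that the file was truncated or altered in transit and should be downloaded again.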

Figure 4. Download page and an example of the download result.

Conclusion

To provide a taxonomically complete and comprehensive reference sequence database for fish, we developed the CoSFISH database, a valuable resource for the study of fish diversity, phylogenetics and biological evolution. CoSFISH stores 21 535 COI sequences and 1074 18S rRNA sequences from 21 589 fish species and provides online analysis tools to compare sequence similarity, conduct phylogenetic analysis and design primers. We will update the CoSFISH database as COI and 18S rRNA sequences of additional fish species are released. In addition, we will incorporate a wider range of bioinformatic analysis tools to enhance its functionality, making CoSFISH a more comprehensive platform for sharing, integrating and utilizing fish genetic resources.

Data availability

All data in this study can be accessed through the web server at http://210.22.121.250:8888/CoSFISH/home/indexPage.

Funding

China-ASEAN Maritime Cooperation Fund (no. CAMC-2018F), Guangdong Rural Revitalization Strategy Special Provincial Organization and Implementation Project Funds (no. 2022SBH00001), Guangdong Provincial Special Fund for Modern Agriculture Industry Technology Innovation Team (no. 2023KJ150 and 2023KJ134), National Freshwater Genetic Resource Center (no. FGRC18537).

Conflict of interest statement

There is no conflict of interest.

References

1. Nelson J.S., Grande T.C. and Wilson M.V. (2016) Fishes of the World. 5th edn. John Wiley & Sons, Hoboken.

2. Froese R. and Pauly D. (eds.) (2024) FishBase. World Wide Web electronic publication. www.fishbase.org, version (02/2024).

3. Piersma T. and Drent J. (2003) Phenotypic flexibility and the evolution of organismal design. Trends Ecol. Evol., 18, 228–233.

4. Galtier N., Nabholz B., Glémin S. et al. (2009) Mitochondrial DNA as a marker of molecular diversity: a reappraisal. Mol. Ecol., 18, 4541–4550.

5. Hebert P.D., Cywinska A., Ball S.L. et al. (2003) Biological identifications through DNA barcodes. Proc. R. Soc. B, 270, 313–321.

6. Krehenwinkel H., Pomerantz A., Henderson J.B. et al. (2019) Nanopore sequencing of long ribosomal DNA amplicons enables portable and simple biodiversity assessments with high phylogenetic resolution across broad taxonomic scale. GigaScience, 8, 1–16.

7. Sato Y., Miya M., Fukunaga T. et al. (2018) MitoFish and MiFish pipeline: a mitochondrial genome database of fish with an analysis pipeline for environmental DNA metabarcoding. Mol. Biol. Evol., 35, 1553–1555.

8. Zhu T., Sato Y., Sado T. et al. (2023) MitoFish, MitoAnnotator, and MiFish Pipeline: updates in 10 years. Mol. Biol. Evol., 40, 1–5.

9. Yang L., Xu Z., Zeng H. et al. (2020) FishDB: an integrated functional genomics database for fishes. BMC Genom., 21, 1–5.

10. Sun Y., Huang Y., Li X. et al. (2016) Fish-T1K (Transcriptomes of 1,000 Fishes) project: large-scale transcriptome data for fish evolution studies. GigaScience, 5, s13742-016.

11. Guo C., Duan Y., Ye W. et al. (2023) FishGET: a fish gene expression and transcriptome database with improved accuracy and visualization. iScience, 26, 1–15.

12. Guo C., Ye W., Shi M. et al. (2023) FishSCT: a zebrafish-centric database for exploration and visualization of fish single-cell transcriptome. Sci. China Life Sci., 66, 2185–2188.

13. Guo C., Ye W., Cao D. et al. (2024) Unraveling the stereoscopic gene transcriptional landscape of zebrafish using FishSED, a fish spatial expression database with multispecies scalability. Sci. China Life Sci., 67, 843–846.

14. Lu Y.R., Fang C.C. and He S.P. (2023) cnfishbase: a cyber Chinese fish database. Zool. Res., 44, 950–953.

15. Sayers E.W., Beck J., Bolton E.E. et al. (2024) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 52, D33–D43.

16. Ratnasingham S. and Hebert P.D. (2007) BOLD: the barcode of life data system (http://www.barcodinglife.org). Mol. Ecol. Notes, 7, 355–364.

17. Camacho C., Boratyn G.M., Joukov V. et al. (2023) ElasticBLAST: accelerating sequence search via cloud computing. BMC Bioinform., 24, 1–16.

18. Edgar R.C. (2022) Muscle5: High-accuracy alignment ensembles enable unbiased assessments of sequence homology and phylogeny. Nat. Commun., 13, 1–9.

19. Birney E. and Durbin R. (2000) Using GeneWise in the Drosophila annotation experiment. Genome Res., 10, 547–548.

20. Harris R.S. (2007) Improved Pairwise Alignment of Genomic DNA. Pennsylvania State University, University Park, PA.

21. Kõressaar T., Lepamets M., Kaplinski L. et al. (2018) Primer3_masker: integrating masking of template sequence with primer design software. Bioinformatics, 34, 1937–1938.

22. Miller M.A., Pfeiffer W. and Schwartz T. (2011) The CIPRES science gateway: a community resource for phylogenetic analyses. In: Proceedings of the 2011 TeraGrid Conference: Extreme Digital Discovery, Salt Lake City, UT. pp. 1–8.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.