Optimizing de novo transcriptome assembly and extending genomic resources for striped catfish (Pangasianodon hypophthalmus)

Thanh, Nguyen Minh, Jung, Hyungtaek, Lyons, Russell E., Njaci, Isaac, Yoon, Byoung-Ha, Chand, Vincent, Tuan, Nguyen Viet, Thu, Vo Thi Minh and Mather, Peter (2015) Optimizing de novo transcriptome assembly and extending genomic resources for striped catfish (Pangasianodon hypophthalmus). Marine Genomics, 23 87-97. doi:10.1016/j.margen.2015.05.001


Author Thanh, Nguyen Minh
Jung, Hyungtaek
Lyons, Russell E.
Njaci, Isaac
Yoon, Byoung-Ha
Chand, Vincent
Tuan, Nguyen Viet
Thu, Vo Thi Minh
Mather, Peter
Title Optimizing de novo transcriptome assembly and extending genomic resources for striped catfish (Pangasianodon hypophthalmus)
Journal name Marine Genomics
ISSN 1874-7787
1876-7478
Publication date 2015-05-12
Year available 2015
Sub-type Article (original research)
DOI 10.1016/j.margen.2015.05.001
Open Access Status Not yet assessed
Volume 23
Start page 87
End page 97
Total pages 11
Place of publication Amsterdam, Netherlands
Publisher Elsevier
Language eng
Subject 1104 Aquatic Science
1311 Genetics
Abstract Striped catfish (Pangasianodon hypophthalmus) is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The culture industry is facing a significant challenge however from saltwater intrusion into many low topographical coastal provinces across the Mekong Delta as a result of predicted climate change impacts. Developing genomic resources for this species can facilitate the production of improved culture lines that can withstand raised salinity conditions, and so we have applied high-throughput Ion Torrent sequencing of transcriptome libraries from six target osmoregulatory organs from striped catfish as a genomic resource for use in future selection strategies. We obtained 12,177,770 reads after trimming and processing with an average length of 97 bp. De novo assemblies were generated using CLC Genomic Workbench, Trinity and Velvet/Oases with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 66,451 contigs with an average length of 478 bp and N50 length of 506 bp. A total of 37,969 contigs (57%) possessed significant similarity with proteins in the non-redundant database. Comparative analyses revealed that a significant number of contigs matched sequences reported in other teleost fishes, ranging in similarity from 45.2% with Atlantic cod to 52% with zebrafish. In addition, 28,879 simple sequence repeats (SSRs) and 55,721 single nucleotide polymorphisms (SNPs) were detected in the striped catfish transcriptome. The sequence collection generated in the current study represents the most comprehensive genomic resource for P. hypophthalmus available to date. Our results illustrate the utility of next-generation sequencing as an efficient tool for constructing a large genomic database for marker development in non-model species.
Keyword Ion Torrent
Pangasianodon hypophthalmus
Salinity tolerance
Simple sequence repeat
Q-Index Code C1
Q-Index Status Confirmed Code
Grant ID 106.99-2011.63
Institutional Status UQ
Additional Notes Article in press corrected proof.

Document type: Journal Article
Sub-type: Article (original research)
Collections: Official 2016 Collection
School of Veterinary Science Publications
 
Citation counts: Thomson Reuters Web of Science Citation Count Cited 4 times
Scopus Citation Count Cited 4 times
Created: Tue, 26 May 2015, 14:02:36 EST by System User on behalf of Scholarly Communication and Digitisation Service