A transcriptional sketch of a primary human breast cancer by 454 deep sequencing

Guffanti, Alessandro, Iacono, Michele, Pelucchi, Paride, Kim, Namshin, Solda, Giulia, Croft, Larry J., Taft, Ryan J., Rizzi, Ermanno, Askarian-Amiri, Marjan, Bonnal, Raoul J., Callari, Maurizio, Mignone, Flavio, Pesole, Graziano, Bertalot, Giovanni, Bernardi, Luigi Rossi, Albertini, Alberto, Lee, Christopher, Mattick, John S., Zucchi, Ileana and De Bellis, Gianluca (2009) A transcriptional sketch of a primary human breast cancer by 454 deep sequencing. BMC Genomics, 10 163.1-163.17. doi:10.1186/1471-2164-10-163


Author Guffanti, Alessandro
Iacono, Michele
Pelucchi, Paride
Kim, Namshin
Solda, Giulia
Croft, Larry J.
Taft, Ryan J.
Rizzi, Ermanno
Askarian-Amiri, Marjan
Bonnal, Raoul J.
Callari, Maurizio
Mignone, Flavio
Pesole, Graziano
Bertalot, Giovanni
Bernardi, Luigi Rossi
Albertini, Alberto
Lee, Christopher
Mattick, John S.
Zucchi, Ileana
De Bellis, Gianluca
Title A transcriptional sketch of a primary human breast cancer by 454 deep sequencing
Journal name BMC Genomics
ISSN 1471-2164
Publication date 2009-04-01
Sub-type Article (original research)
DOI 10.1186/1471-2164-10-163
Open Access Status DOI
Volume 10
Start page 163.1
End page 163.17
Total pages 17
Editor Dr Melissa Norton
Place of publication London, United Kingdom
Publisher BioMed Central
Language eng
Subject C1
9201 Clinical Health (Organs, Diseases and Abnormal Conditions)
920102 Cancer and Related Disorders
1107 Immunology
110799 Immunology not elsewhere classified
Abstract Background: The cancer transcriptome is difficult to explore due to the heterogeneity of quantitative and qualitative changes in gene expression linked to the disease status. An increasing number of "unconventional" transcripts, such as novel isoforms, non-coding RNAs, somatic gene fusions and deletions, have been associated with the tumoral state. Massively parallel sequencing techniques provide a framework for exploring the transcriptional complexity inherent to cancer with limited laboratory and financial effort. We developed a deep sequencing and bioinformatics analysis protocol to investigate the molecular composition of a breast cancer poly(A)+ transcriptome. This method utilizes a cDNA library normalization step to diminish the representation of highly expressed transcripts and biology-oriented bioinformatic analyses to facilitate detection of rare and novel transcripts.

Results: We analyzed over 132,000 Roche 454 high-confidence deep sequencing reads from a primary human lobular breast cancer tissue specimen, and detected a range of unusual transcriptional events that were subsequently validated by RT-PCR in eight additional primary human breast cancer samples. We identified and validated one deletion, two novel ncRNAs (one intergenic and one intragenic), ten previously unknown or rare transcript isoforms and a novel gene fusion specific to a single primary tissue sample. We also explored the non-protein-coding portion of the breast cancer transcriptome, identifying thousands of novel non-coding transcripts and more than three hundred reads corresponding to the non-coding RNA MALAT1, which is highly expressed in many human carcinomas.

Conclusion: Our results demonstrate that combining 454 deep sequencing with a normalization step and careful bioinformatic analysis facilitates the discovery and quantification of rare transcripts or ncRNAs, and can be used as a qualitative tool to characterize transcriptome complexity, revealing many hitherto unknown transcripts, splice isoforms, gene fusion events and ncRNAs, even at relatively low sequence sampling.
Keyword Long noncoding RNAs
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
Additional Notes Article number 163

Document type: Journal Article
Sub-type: Article (original research)
Collections: 2010 Higher Education Research Data Collection
ERA 2012 Admin Only
Institute for Molecular Bioscience - Publications
 
Citation counts: Cited 138 times in Thomson Reuters Web of Science
Cited 153 times in Scopus
Created: Thu, 03 Sep 2009, 18:00:48 EST by Mr Andrew Martlew on behalf of Institute for Molecular Bioscience