A statistical method for predicting splice variants between two groups of samples using GeneChip® expression array data

Fan, Wenhong, Khalid, Najma, Hallahan, Andrew R., Olsen, James M. and Zhao, Lue Ping (2006) A statistical method for predicting splice variants between two groups of samples using GeneChip® expression array data. Theoretical Biology and Medical Modelling, 3: 19. doi:10.1186/1742-4682-3-19

Author Fan, Wenhong
Khalid, Najma
Hallahan, Andrew R.
Olsen, James M.
Zhao, Lue Ping
Title A statistical method for predicting splice variants between two groups of samples using GeneChip® expression array data
Journal name Theoretical Biology and Medical Modelling
ISSN 1742-4682
Publication date 2006-04-07
Sub-type Article (original research)
DOI 10.1186/1742-4682-3-19
Open Access Status DOI
Volume 3
Issue 19
Total pages 9
Place of publication London , U.K.
Publisher BioMed Central
Language eng
Subject 11 Medical and Health Sciences
1117 Public Health and Health Services
Formatted abstract
Background: Alternative splicing of pre-messenger RNA results in RNA variants with combinations of selected exons. It is one of the essential biological functions and regulatory components in higher eukaryotic cells. Some of these variants are detectable with the Affymetrix GeneChip®, which uses multiple oligonucleotide probes per gene (i.e., a probe set), because the target sequences of these probes are adjacent within each gene. Hybridization intensity from a probe correlates with abundance of the corresponding transcript. Although the multiple-probe feature in the current GeneChip® was designed to assess expression values of individual genes, it also measures transcriptional abundance for a sub-region of a gene sequence. This additional capacity motivated us to develop a method to predict alternative splicing, taking advantage of extensive repositories of GeneChip® gene expression array data.
Results: We developed a two-step approach to predict alternative splicing from GeneChip® data. First, we clustered the probes from a probe set into pseudo-exons based on similarity of probe intensities and physical adjacency. A pseudo-exon is defined as a sequence in the gene within which multiple probes have comparable probe intensity values. Second, for each pseudo-exon, we assessed the statistical significance of the difference in probe intensity between two groups of samples. Differentially expressed pseudo-exons are predicted to be alternatively spliced. We applied our method to empirical data generated from GeneChip® Hu6800 arrays, which include 7129 probe sets and twenty probes per probe set. The dataset consists of sixty-nine medulloblastoma samples (27 metastatic and 42 non-metastatic) and four cerebellum samples as normal controls. We predicted that 577 genes would be alternatively spliced when we compared normal cerebellum samples to medulloblastomas, and that thirteen genes would be alternatively spliced when we compared metastatic medulloblastomas to non-metastatic ones. We checked the consistency of some of our findings with information in the UCSC Human Genome Browser.
Conclusion: The two-step approach described in this paper is capable of predicting some alternative splicing from multiple oligonucleotide-based gene expression array data with GeneChip® technology. Our method employs the extensive repositories of gene expression array data available and generates alternative splicing hypotheses, which can be further validated by experimental studies.
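The two-step approach summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the clustering rule or the test statistic, so everything concrete here is an assumption made for illustration: the greedy adjacency rule with tolerance `tol`, the use of Welch's t statistic with a fixed cutoff `t_cutoff`, the function names, and the toy log-intensity values. It is not the authors' implementation.

```python
from math import sqrt
from statistics import mean, variance


def cluster_pseudo_exons(probe_means, tol=0.5):
    """Step 1 (assumed rule): greedily group physically adjacent probes
    whose mean intensity stays within `tol` of the running segment mean.
    Returns a list of segments, each a list of probe indices."""
    segments = [[0]]
    for i in range(1, len(probe_means)):
        seg_mean = mean(probe_means[j] for j in segments[-1])
        if abs(probe_means[i] - seg_mean) <= tol:
            segments[-1].append(i)
        else:
            segments.append([i])
    return segments


def welch_t(a, b):
    """Welch's t statistic for two independent samples.
    Assumes each group has at least two values and nonzero variance."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def predict_splicing(group_a, group_b, tol=0.5, t_cutoff=4.0):
    """Step 2 (assumed test): compare per-sample pseudo-exon means
    between the two groups. Each sample is a list of log-intensities,
    one per probe, ordered by position along the gene."""
    all_samples = group_a + group_b
    n_probes = len(all_samples[0])
    probe_means = [mean(s[i] for s in all_samples) for i in range(n_probes)]
    results = []
    for seg in cluster_pseudo_exons(probe_means, tol):
        a = [mean(s[i] for i in seg) for s in group_a]
        b = [mean(s[i] for i in seg) for s in group_b]
        t = welch_t(a, b)
        results.append((seg, t, abs(t) > t_cutoff))
    return results


# Hypothetical toy data: 6 probes, 3 samples per group.
# Probes 3-5 are lower in group_b, mimicking a skipped exon.
group_a = [
    [8.0, 8.1, 7.9, 8.0, 8.2, 7.9],
    [8.2, 8.0, 8.1, 8.1, 7.9, 8.0],
    [7.9, 8.1, 8.0, 7.8, 8.0, 8.1],
]
group_b = [
    [8.1, 7.9, 8.0, 5.0, 5.1, 4.9],
    [7.9, 8.2, 8.1, 5.2, 4.9, 5.0],
    [8.0, 8.0, 7.8, 4.9, 5.0, 5.2],
]

results = predict_splicing(group_a, group_b)
for probes, t, flagged in results:
    print(probes, round(t, 2), "candidate" if flagged else "unchanged")
```

On this toy input the probes split into two pseudo-exons, and only the second one (probes 3 to 5) exceeds the cutoff. A real analysis would replace the fixed cutoff with a significance assessment that accounts for testing many pseudo-exons across thousands of probe sets.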
Keyword Affymetrix GeneChip®
Oligonucleotide probes
Q-Index Code C1

Citation counts: Thomson Reuters Web of Science: cited 8 times
Scopus: cited 9 times
Created: Tue, 24 Mar 2009, 12:26:14 EST by Ms Julie Schofield on behalf of Faculty Of Health Sciences