YuGene: A simple approach to scale gene expression data derived from different platforms for integrated analyses

Le Cao, Kim-Anh, Rohart, Florian, McHugh, Leo, Korn, Othmar and Wells, Christine A (2014) YuGene: A simple approach to scale gene expression data derived from different platforms for integrated analyses. Genomics, 103(4): 239-251. doi:10.1016/j.ygeno.2014.03.001

Attached Files
Name | Description | MIME type | Size | Downloads
UQ331720_OA.pdf | Full text (open access) | application/pdf | 2.97 MB | 0

Author Le Cao, Kim-Anh
Rohart, Florian
McHugh, Leo
Korn, Othmar
Wells, Christine A
Title YuGene: A simple approach to scale gene expression data derived from different platforms for integrated analyses
Journal name Genomics
ISSN 1089-8646
Publication date 2014-04-01
Year available 2014
Sub-type Article (original research)
DOI 10.1016/j.ygeno.2014.03.001
Open Access Status File (Author Post-print)
Volume 103
Issue 4
Start page 239
End page 251
Total pages 13
Place of publication Maryland Heights, United States
Publisher Academic Press
Language eng
Formatted abstract
Gene expression databases contain invaluable information about a range of cell states, but the question "Where is my gene of interest expressed?" remains one of the most difficult to systematically assess when relevant data is derived from different platforms. Barriers to integrating this data include disparities in data formats and scale, a lack of common identifiers, and the disproportionate contribution of a platform to the 'batch effect'. There are few purpose-built cross-platform normalization strategies, and most of these fit data to an idealized data structure, which in turn may compromise gene expression comparisons between different platforms. YuGene addresses this gap by providing a simple transform that assigns a modified cumulative proportion value to each measurement, without losing essential underlying information on data distributions or experimental correlates. The YuGene transform is applied to individual samples and is suitable for data with different distributions. YuGene is robust to combining datasets of different sizes, does not require global renormalization as new data is added, and does not require a common identifier. YuGene was benchmarked against commonly used normalization approaches, performing favorably in comparison to quantile (RMA), Z-score or rank methods. Implementation in the www.stemformatics.org resource provides users with expression queries across stem cell-related datasets. Probe performance statistics, including poorly performing (never expressed) probes, and examples of probes/genes expressed in a sample-restricted manner are provided. The YuGene software is implemented as an R package available from CRAN.
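The per-sample idea in the abstract can be illustrated with a minimal Python sketch. It is not the YuGene R package and may differ from the published method: the abstract does not define the "modified cumulative proportion", so the sketch assumes it means 1 minus the cumulative proportion of a sample's values sorted in decreasing order. The function name `cumulative_proportion_transform` is hypothetical, inputs are assumed non-negative, and tied values are ranked by position, so they can receive different scores.

```python
import numpy as np


def cumulative_proportion_transform(sample):
    """Scale one sample's expression values into [0, 1).

    Assumption (not stated in the abstract): the score of a measurement is
    1 minus the cumulative proportion of the sample total, accumulated over
    values sorted from highest to lowest.
    """
    values = np.asarray(sample, dtype=float)
    if values.ndim != 1 or values.size == 0:
        raise ValueError("expected a non-empty 1-D array for a single sample")
    if np.any(values < 0):
        raise ValueError("this sketch assumes non-negative expression values")

    total = values.sum()
    if total == 0:
        # Nothing is expressed in this sample, so every score is zero.
        return np.zeros_like(values)

    # Sort indices from highest to lowest value (stable, so ties keep input order).
    order = np.argsort(-values, kind="stable")
    cumulative = np.cumsum(values[order]) / total

    # Write the scores back in the original measurement order.
    scaled = np.empty_like(values)
    scaled[order] = 1.0 - cumulative
    return scaled


if __name__ == "__main__":
    sample = np.array([1.0, 4.0, 2.0, 3.0])
    print(cumulative_proportion_transform(sample))
    # The same sample measured on a platform with a 10x larger scale.
    print(cumulative_proportion_transform(10.0 * sample))
```

Because each sample is scaled using only its own values, the scores do not change when the whole sample is multiplied by a constant, and adding new samples does not require rescaling existing ones. This mirrors the abstract's points about individual-sample application and the absence of global renormalization.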
Keyword Cross platform normalization
Gene expression
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

Citation counts: Thomson Reuters Web of Science: cited 2 times
Scopus: cited 12 times
Created: Tue, 03 Jun 2014, 11:40:03 EST by System User on behalf of Aust Institute for Bioengineering & Nanotechnology