Group and sparse group partial least square approaches applied in genomics context

Liquet, Benoit, de Micheaux, Pierre Lafaye, Hejblum, Boris P. and Thiebaut, Rodolphe (2016) Group and sparse group partial least square approaches applied in genomics context. Bioinformatics, 32(1): 35-42. doi:10.1093/bioinformatics/btv535


Author Liquet, Benoit
de Micheaux, Pierre Lafaye
Hejblum, Boris P.
Thiebaut, Rodolphe
Title Group and sparse group partial least square approaches applied in genomics context
Journal name Bioinformatics
ISSN 1367-4803
1367-4811
Publication date 2016-01-01
Year available 2015
Sub-type Article (original research)
DOI 10.1093/bioinformatics/btv535
Open Access Status Not Open Access
Volume 32
Issue 1
Start page 35
End page 42
Total pages 8
Place of publication Oxford, United Kingdom
Publisher Oxford University Press
Language eng
Formatted abstract
Motivation: The association between two blocks of ‘omics’ data raises challenging issues in computational biology because of the size and complexity of the data. Here, we focus on a class of multivariate statistical methods called partial least square (PLS). The sparse version of PLS (sPLS) integrates two datasets while simultaneously selecting the contributing variables. However, these methods do not take into account the important structural or group effects that arise from the relationships between markers within biological pathways. Taking predefined groups of markers (e.g. gene sets) into account could therefore improve the relevance and efficacy of the PLS approach.
Results: We propose two PLS extensions called group PLS (gPLS) and sparse gPLS (sgPLS). Our algorithm makes it possible to study the relationship between two different types of omics data (e.g. SNP and gene expression) or between an omics dataset and multivariate phenotypes (e.g. cytokine secretion). We demonstrate the good performance of gPLS and sgPLS compared with sPLS in the context of grouped data. These methods are then compared on data from an HIV therapeutic vaccine trial. Our approaches provide parsimonious models that reveal the relationship between gene abundance and the immunological response to the vaccine.
Availability and implementation: The approach is implemented in a comprehensive R package called sgPLS, available on CRAN.
Contact: b.liquet@uq.edu.au
Supplementary information: Supplementary data are available at Bioinformatics online.
Keyword Canonical correlation-analysis
Group lasso
Matrix decomposition
Microbiome data
Regression
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status UQ
Additional Notes Published online 10 September 2015

Document type: Journal Article
Sub-type: Article (original research)
Collections: School of Mathematics and Physics
Official 2016 Collection
 
Citation counts: Cited 2 times in Thomson Reuters Web of Science
Cited 2 times in Scopus
Created: Sun, 14 Feb 2016, 10:24:57 EST by System User on behalf of Learning and Research Services (UQ Library)