mixOmics: An R package for 'omics feature selection and multiple data integration

Rohart, Florian, Gautier, Benoît, Singh, Amrit and Lê Cao, Kim-Anh (2017) mixOmics: An R package for 'omics feature selection and multiple data integration. PLoS computational biology, 13(11): e1005752. doi:10.1371/journal.pcbi.1005752

Author Rohart, Florian
Gautier, Benoît
Singh, Amrit
Lê Cao, Kim-Anh
Title mixOmics: An R package for 'omics feature selection and multiple data integration
Journal name PLoS computational biology
ISSN 1553-7358
Publication date 2017-11-03
Year available 2017
Sub-type Article (original research)
DOI 10.1371/journal.pcbi.1005752
Open Access Status DOI
Volume 13
Issue 11
Start page e1005752
Total pages 19
Place of publication SAN FRANCISCO
Publisher Public Library of Science
Language eng
Subject 1105 Ecology, Evolution, Behavior and Systematics
2611 Modelling and Simulation
2303 Ecology
1312 Molecular Biology
1311 Genetics
2804 Cellular and Molecular Neuroscience
1703 Computational Theory and Mathematics
Abstract The advent of high-throughput technologies has led to a wealth of publicly available 'omics data coming from different sources, such as transcriptomics, proteomics and metabolomics. Combining such large-scale biological data sets can lead to the discovery of important biological insights, provided that relevant information can be extracted in a holistic manner. Current statistical approaches have focused on identifying small subsets of molecules (a 'molecular signature') to explain or predict biological conditions, but mainly for a single type of 'omics. In addition, commonly used methods are univariate and consider each biological feature independently. We introduce mixOmics, an R package dedicated to the multivariate analysis of biological data sets with a specific focus on data exploration, dimension reduction and visualisation. By adopting a systems biology approach, the toolkit provides a wide range of methods that statistically integrate several data sets at once to probe relationships between heterogeneous 'omics data sets. Our recent methods extend Projection to Latent Structure (PLS) models for discriminant analysis, for data integration across multiple 'omics data or across independent studies, and for the identification of molecular signatures. We illustrate our latest mixOmics integrative frameworks for the multivariate analyses of 'omics data available from the package.
Keyword Partial Least-Squares
Canonical Correlation-Analysis
Q-Index Code C1
Q-Index Status Provisional Code
Grant ID APP1087415
Institutional Status UQ

Document type: Journal Article
Sub-type: Article (original research)
Collections: UQ Diamantina Institute Publications
Admin Only - UQ Diamantina Institute
Pubmed Import
Created: Wed, 08 Nov 2017, 12:01:20 EST