On the simultaneous use of clinical and microarray expression data in the cluster analysis of tissue samples

McLachlan, G. J., Chang, S., Mar, J. and Ambroise, C. (2004). On the simultaneous use of clinical and microarray expression data in the cluster analysis of tissue samples. In: Yi-Ping Phoebe Chen, Proceedings of the Second Asia-Pacific Bioinformatics Conference (APBC2004). Second Asia-Pacific Bioinformatics Conference, Dunedin, New Zealand, (167-171). 18-22 January 2004.


Author McLachlan, G. J.
Chang, S.
Mar, J.
Ambroise, C.
Title of paper On the simultaneous use of clinical and microarray expression data in the cluster analysis of tissue samples
Conference name Second Asia-Pacific Bioinformatics Conference
Conference location Dunedin, New Zealand
Conference dates 18-22 January 2004
Proceedings title Proceedings of the Second Asia-Pacific Bioinformatics Conference (APBC2004)
Place of Publication Sydney, Australia
Publisher Australian Computer Society
Publication Year 2004
Sub-type Fully published paper
ISBN 2-920682-11-2
ISSN 1445-1335
Editor Yi-Ping Phoebe Chen
Volume 29
Start page 167
End page 171
Total pages 5
Collection year 2004
Language eng
Abstract/Summary This paper considers a model-based approach to the clustering of tissue samples of a very large number of genes from microarray experiments. It is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. Frequently in practice, there are also clinical data available on those cases on which the tissue samples have been obtained. Here we investigate how to use the clinical data in conjunction with the microarray gene expression data to cluster the tissue samples. We propose two mixture model-based approaches in which the number of components in the mixture model corresponds to the number of clusters to be imposed on the tissue samples. One approach specifies the components of the mixture model to be the conditional distributions of the microarray data given the clinical data with the mixing proportions also conditioned on the latter data. Another takes the components of the mixture model to represent the joint distributions of the clinical and microarray data. The approaches are demonstrated on some breast cancer data, as studied recently in van't Veer et al. (2002).
Subjects E1
230204 Applied Statistics
780101 Mathematical sciences
Keyword Microarrays
Gene expressions
Mixture modelling
Cluster analysis
Clinical data
Q-Index Code E1
Additional Notes Proceedings series title: Conferences in Research and Practice in Information Technology
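
The two mixture formulations named in the abstract can be written schematically as follows. This is only a sketch of the general structure: the symbols (y_j for the gene expression vector of tissue sample j, x_j for its clinical data, g for the number of components, pi_i for mixing proportions, f_i for component densities) are notation assumed here for illustration and are not taken from the paper, and the abstract does not specify the form of the component densities.

```latex
% Notation assumed for illustration only; not taken from the paper.
% y_j : microarray expression vector for tissue sample j
% x_j : clinical data for the same case
% g   : number of mixture components (= number of clusters)

% Approach 1: components are conditional distributions of the microarray
% data given the clinical data, with mixing proportions that also depend
% on the clinical data.
\[
  f(y_j \mid x_j) \;=\; \sum_{i=1}^{g} \pi_i(x_j)\, f_i(y_j \mid x_j),
  \qquad \pi_i(x_j) \ge 0, \quad \sum_{i=1}^{g} \pi_i(x_j) = 1 .
\]

% Approach 2: components are joint distributions of the clinical and
% microarray data, with constant mixing proportions.
\[
  f(y_j, x_j) \;=\; \sum_{i=1}^{g} \pi_i\, f_i(y_j, x_j),
  \qquad \pi_i \ge 0, \quad \sum_{i=1}^{g} \pi_i = 1 .
\]
```

In either form, each of the g components corresponds to one cluster of tissue samples; in standard mixture-model clustering, a sample is typically assigned to the component with the highest estimated posterior probability, although the abstract does not state which assignment rule the authors use.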

 
Created: Thu, 23 Aug 2007, 19:12:32 EST