On selection biases with prediction rules formed from gene expression data

Zhu, J. X., McLachlan, G. J., Jones, L. B. T. and Wood, I. A. (2008) On selection biases with prediction rules formed from gene expression data. Journal of Statistical Planning and Inference, 138(2): 374-386. doi:10.1016/j.jspi.2007.06.003

Author Zhu, J. X.
McLachlan, G. J.
Jones, L. B. T.
Wood, I. A.
Title On selection biases with prediction rules formed from gene expression data
Journal name Journal of Statistical Planning and Inference
ISSN 0378-3758
Publication date 2008-01-01
Year available 2007
Sub-type Article (original research)
DOI 10.1016/j.jspi.2007.06.003
Open Access Status Not yet assessed
Volume 138
Issue 2
Start page 374
End page 386
Total pages 13
Editor Balakrishnan, N.
Place of publication Netherlands
Publisher Elsevier
Language eng
Subject 230204 Applied Statistics
270201 Gene Expression
780101 Mathematical sciences
730305 Diagnostic methods
Abstract There has been ever-increasing interest in the use of microarray experiments as a basis for the provision of prediction (discriminant) rules for improved diagnosis of cancer and other diseases. Typically, microarray cancer studies provide only a limited number of tissue samples from the specified classes of tumours or patients, whereas each tissue sample may contain the expression levels of thousands of genes. Thus researchers are faced with the problem of forming a prediction rule on the basis of a small number of classified tissue samples, which are of very high dimension. Usually, some form of feature (gene) selection is adopted in the formation of the prediction rule. As the subset of genes used in the final form of the rule has not been randomly selected but rather chosen according to some criterion designed to reflect the predictive power of the rule, there will be a selection bias inherent in estimates of the error rates of the rules if care is not taken. We shall present various situations where selection bias arises in the formation of a prediction rule and where there is a consequent need for the correction of this bias. We describe the design of cross-validation schemes that are able to correct for the various selection biases. © 2007 Elsevier B.V. All rights reserved.
Keyword Statistics & Probability
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
Additional Notes Available online 12 June 2007 and published in journal 2008.
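
The selection bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: the pure-noise data, the mean-difference gene ranking, the nearest-centroid rule, and the helper names (select_genes, centroid_predict, cv_error) are all assumptions chosen for brevity. It compares cross-validation in which genes are selected once on all samples with cross-validation in which selection is repeated inside each training fold.

```python
import random

def select_genes(X, y, rows, k):
    """Rank genes by absolute difference in class means over `rows`; return the top k."""
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        a = [X[i][g] for i in rows if y[i] == 0]
        b = [X[i][g] for i in rows if y[i] == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]

def centroid_predict(X, y, train, genes, i):
    """Nearest-centroid prediction for sample i, using only `genes` and the training rows."""
    dists = []
    for c in (0, 1):
        members = [r for r in train if y[r] == c]
        d = 0.0
        for g in genes:
            centre = sum(X[r][g] for r in members) / len(members)
            d += (X[i][g] - centre) ** 2
        dists.append(d)
    return 0 if dists[0] <= dists[1] else 1

def cv_error(X, y, k, n_folds, select_inside):
    """Cross-validated error rate; gene selection is redone per fold only if select_inside."""
    all_rows = list(range(len(X)))
    # Selecting once on all samples lets the held-out samples influence the gene subset.
    fixed = None if select_inside else select_genes(X, y, all_rows, k)
    errors = 0
    for f in range(n_folds):
        test = [i for i in all_rows if i % n_folds == f]
        train = [i for i in all_rows if i % n_folds != f]
        genes = select_genes(X, y, train, k) if select_inside else fixed
        errors += sum(centroid_predict(X, y, train, genes, i) != y[i] for i in test)
    return errors / len(X)

# Hypothetical data: 40 samples, 1000 genes, no true class signal at all.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(40)]
y = [0] * 20 + [1] * 20

biased = cv_error(X, y, k=10, n_folds=5, select_inside=False)
external = cv_error(X, y, k=10, n_folds=5, select_inside=True)
print("selection outside CV:", biased)
print("selection inside CV: ", external)
```

Because the simulated genes carry no class information, any rule has a true error rate of 0.5. Selecting genes before cross-validation should nevertheless report a much lower error, while repeating the selection within each fold should give an estimate near chance.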

Document type: Journal Article
Collection: 2008 Higher Education Research Data Collection
Citation counts: Thomson Reuters Web of Science: cited 11 times
Scopus: cited 10 times
Created: Wed, 09 Apr 2008, 21:21:02 EST by Marie Grove on behalf of School of Mathematics & Physics