A graphical tool for selecting the number of slices and the dimension of the model in SIR and SAVE approaches

Liquet, Benoit and Saracco, Jerome (2012) A graphical tool for selecting the number of slices and the dimension of the model in SIR and SAVE approaches. Computational Statistics, 27 1: 103-125. doi:10.1007/s00180-011-0241-9

Author Liquet, Benoit
Saracco, Jerome
Title A graphical tool for selecting the number of slices and the dimension of the model in SIR and SAVE approaches
Journal name Computational Statistics
ISSN 0943-4062
Publication date 2012-03
Year available 2012
Sub-type Article (original research)
DOI 10.1007/s00180-011-0241-9
Volume 27
Issue 1
Start page 103
End page 125
Total pages 23
Place of publication Heidelberg, Germany
Publisher Physica-Verlag GmbH
Collection year 2013
Language eng
Formatted abstract
Sliced inverse regression (SIR) and related methods were introduced in order to reduce the dimensionality of regression problems. In a general semiparametric regression framework, these methods determine linear combinations of a set of explanatory variables X related to the response variable Y, without losing information on the conditional distribution of Y given X. They are based on a "slicing step" in the population and sample versions. They are sensitive to the choice of the number H of slices, and this is particularly true for the SIR-II and SAVE methods. At the moment there are neither theoretical results nor practical techniques that allow the user to choose an appropriate number of slices. In this paper, we propose an approach based on the quality of the estimation of the effective dimension reduction (EDR) space: the square trace correlation between the true EDR space and its estimate can be used as a measure of goodness of estimation. We introduce a naïve bootstrap estimation of the square trace correlation criterion to allow selection of an "optimal" number of slices. Moreover, this criterion can also simultaneously select the corresponding suitable dimension K (the number of linear combinations of X). From a practical point of view, the choice of these two parameters H and K is essential. We propose a 3D graphical tool, implemented in R, which can be useful to select a suitable couple (H, K). An R package named "edrGraphicalTools" has been developed. In this article, we focus on the SIR-I, SIR-II and SAVE methods. Moreover, the proposed criterion can be used to determine which method seems to be efficient at recovering the EDR space, that is, the structure between Y and X. We indicate how the proposed criterion can be used in practice. A simulation study is performed to illustrate the behavior of this approach and the need to select the number H of slices and the dimension K properly. A short real-data example is also provided.
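The bootstrap criterion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the edrGraphicalTools package. It is written in Python with NumPy and covers only a basic SIR-I estimator. It rests on several assumptions: slices hold nearly equal numbers of observations, projectors are taken with respect to the full-sample covariance, and the criterion is the average over bootstrap replicates of trace(P P_b) / K, where P and P_b project onto the full-sample and bootstrap estimates of the EDR space. The function names sir_directions, projector, square_trace_correlation and bootstrap_criterion are hypothetical.

```python
import numpy as np


def sir_directions(X, y, H, K):
    """Basic SIR-I sketch: return K estimated EDR directions (p x K) and Sigma."""
    n, p = X.shape
    mean = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    # Inverse square root of Sigma, used to work on a standardized scale.
    w, V = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    # Slicing step: H slices of nearly equal size along the order of y (assumption).
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), H):
        d = X[idx].mean(axis=0) - mean
        M += (len(idx) / n) * np.outer(d, d)
    # Leading eigenvectors of the standardized between-slice covariance matrix.
    evals, evecs = np.linalg.eigh(Sigma_inv_sqrt @ M @ Sigma_inv_sqrt)
    top = evecs[:, np.argsort(evals)[::-1][:K]]
    return Sigma_inv_sqrt @ top, Sigma


def projector(B, Sigma):
    """Sigma-orthogonal projector onto the space spanned by the columns of B."""
    return B @ np.linalg.solve(B.T @ Sigma @ B, B.T @ Sigma)


def square_trace_correlation(B1, B2, Sigma):
    """trace(P1 P2) / K: equals 1 when the two K-dimensional spaces coincide."""
    K = B1.shape[1]
    return float(np.trace(projector(B1, Sigma) @ projector(B2, Sigma)) / K)


def bootstrap_criterion(X, y, H, K, n_boot=50, seed=0):
    """Naive bootstrap estimate of the square trace correlation for one (H, K)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    B_hat, Sigma = sir_directions(X, y, H, K)
    values = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample observations with replacement
        B_boot, _ = sir_directions(X[idx], y[idx], H, K)
        # Assumption: both projectors use the full-sample covariance Sigma.
        values.append(square_trace_correlation(B_hat, B_boot, Sigma))
    return float(np.mean(values))
```

Evaluating bootstrap_criterion over a grid of (H, K) values and plotting the results as a surface would give a rough analogue of the 3D graphical tool: couples for which the criterion stays close to 1 correspond to estimates of the EDR space that are stable under resampling.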
Keyword Bootstrap
Dimension reduction
Sliced inverse regression (SIR)
Sliced average variance estimation (SAVE)
Square trace correlation
Average Variance Estimation
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status Non-UQ

Document type: Journal Article
Collection: School of Mathematics and Physics
Citation counts: Cited 6 times in Thomson Reuters Web of Science
Cited 4 times in Scopus
Created: Fri, 13 Sep 2013, 15:51:58 EST by Kay Mackie on behalf of School of Mathematics & Physics