An empirical study of the sample size variability of optimal active learning using Gaussian process regression

Yeh, F.Y-H. and Gallagher, M. (2008). An empirical study of the sample size variability of optimal active learning using Gaussian process regression. In: Hou, Z-G. and Zhang, N. (Eds.), Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN 2008), IEEE World Congress on Computational Intelligence, Hong Kong, 1-6 June 2008, pp. 3787-3794. doi:10.1109/IJCNN.2008.4634342


Author Yeh, F.Y-H.
Gallagher, M.
Title of paper An empirical study of the sample size variability of optimal active learning using Gaussian process regression
Conference name IEEE World Congress on Computational Intelligence
Conference location Hong Kong
Conference dates 1-6 June 2008
Convener Wang, J.
Proceedings title Proceedings of the IEEE International Joint Conference on Neural Networks, 2008. IJCNN 2008
Journal name 2008 IEEE International Joint Conference on Neural Networks, Vols 1-8
Place of Publication Piscataway NJ USA
Publisher IEEE
Publication Year 2008
Year available 2008
Sub-type Fully published paper
DOI 10.1109/IJCNN.2008.4634342
Open Access Status
ISBN 978-1-4244-1820-6
ISSN 1098-7576
Editor Hou, Z-G.
Zhang, N.
Start page 3787
End page 3794
Total pages 8
Language eng
Abstract/Summary Optimal active learning refers to a framework where the learner actively selects data points to be added to its training set in a statistically optimal way. Under the assumption of log-loss, optimal active learning can be implemented in a relatively simple and efficient manner for regression problems using Gaussian processes. However, to date there has been little attempt to study the experimental behavior and performance of this technique. In this paper, we present a detailed empirical evaluation of optimal active learning using Gaussian processes across a set of seven regression problems from the DELVE repository. In particular, we examine the performance of optimal active learning compared to making random queries, and the impact of experimental factors such as the size and construction of the different sub-datasets used in training and testing the models. It is shown that the multiple sources of variability can be quite significant, suggesting that more care needs to be taken in the evaluation of active learning algorithms.
Subjects E1
890299 Computer Software and Services not elsewhere classified
080108 Neural, Evolutionary and Fuzzy Computation
Keyword Computer Science, Artificial Intelligence
Computer Science, Cybernetics
Engineering, Electrical & Electronic
Computer Science
Engineering
Q-Index Code E1
Q-Index Status Confirmed Code
Institutional Status UQ

Document type: Conference Paper
Sub-type: Fully published paper
Collections: 2009 Higher Education Research Data Collection
School of Information Technology and Electrical Engineering Publications
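The abstract above describes the query strategy only at a high level. As a rough illustration (not the authors' experimental protocol), the following Python sketch shows greedy active learning for Gaussian process regression in which the candidate point with the largest predictive variance is queried at each step; under a Gaussian predictive distribution and log-loss this corresponds to the maximum-entropy query. The toy 1-D target function, pool sizes, kernel choice, and use of scikit-learn's GaussianProcessRegressor are illustrative assumptions, not the DELVE setup used in the paper; the random-query baseline discussed in the abstract can be obtained by replacing the argmax with a random draw from the pool.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

def target(x):
    # Noisy 1-D toy regression target standing in for a DELVE task (assumption).
    return np.sin(3 * x).ravel() + 0.1 * rng.standard_normal(len(x))

X_pool = rng.uniform(-2, 2, size=(200, 1))       # unlabelled candidate pool
X_test = np.linspace(-2, 2, 100).reshape(-1, 1)  # held-out test inputs
y_test = target(X_test)

# Seed the model with a small random training set drawn from the pool.
idx = list(rng.choice(len(X_pool), size=5, replace=False))
X_train, y_train = X_pool[idx], target(X_pool[idx])

gp = GaussianProcessRegressor(kernel=RBF(1.0) + WhiteKernel(0.01), normalize_y=True)

for step in range(20):
    gp.fit(X_train, y_train)
    # Predictive std over the remaining pool; query the most uncertain point,
    # which for a Gaussian predictive distribution is the maximum-entropy choice.
    remaining = [i for i in range(len(X_pool)) if i not in idx]
    _, std = gp.predict(X_pool[remaining], return_std=True)
    query = remaining[int(np.argmax(std))]
    idx.append(query)
    X_train = np.vstack([X_train, X_pool[[query]]])
    y_train = np.append(y_train, target(X_pool[[query]]))

mse = np.mean((gp.predict(X_test) - y_test) ** 2)
print(f"Test MSE after {len(idx)} labelled points: {mse:.4f}")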
 
Created: Fri, 17 Apr 2009, 20:50:20 EST by Ms Kimberley Nunes on behalf of School of Information Technology and Electrical Engineering