Standard algorithms for training support vector machines generally produce solutions containing more support vectors than strictly necessary, so a method of reducing this set is examined. The method is then extended to the training set, decreasing the number of examples needed to train the support vector machine.

It is found that linear dependence plays a crucial role in determining which of the support vectors may be discarded from the support vector set. After determining the vectors that are linearly dependent on other vectors in the support vector set, the support vectors *X*_{k} that satisfy

(1)  *NB: the original equation could not be reproduced from the thesis.*

may be discarded while leaving the solution otherwise unchanged. This method was implemented on several benchmark datasets from the UCI repository [1] (Contraceptive Method Choice, Diabetes, Haberman and Wisconsin Breast Cancer) and the SVMLight Regression dataset [2]. The results demonstrated that in most cases our procedure led to a reduction in the number of vectors in the support vector set.
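The idea of discarding linearly dependent support vectors can be sketched as follows. This is an illustrative reconstruction, not the thesis's implementation: it greedily keeps the kernel-matrix columns that raise the numerical rank and flags the remainder as linearly dependent (and hence candidates for removal under condition (1)).

```python
import numpy as np

def dependent_columns(K, tol=1e-10):
    """Greedily keep kernel-matrix columns that increase the numerical
    rank; the columns left over are linearly dependent on the kept ones."""
    kept, dependent = [], []
    for j in range(K.shape[1]):
        trial = K[:, kept + [j]]
        # Column j is independent only if appending it raises the rank.
        if np.linalg.matrix_rank(trial, tol=tol) > len(kept):
            kept.append(j)
        else:
            dependent.append(j)
    return dependent

# Linear-kernel Gram matrix of x1 = (1,0), x2 = (0,1), x3 = x1 + x2:
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
K = X @ X.T
print(dependent_columns(K))  # the third column adds no new direction
```

Here the third example is a linear combination of the first two, so its kernel column is flagged as dependent.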

While this reduces the number of examples in the support vector set, the support vector machine must still be trained on the entire training set. Since the size of the quadratic programming problem depends solely on the number of training examples, constructing the matrix required for the quadratic programming problem becomes computationally expensive when the training set is large. Because most of the training time is spent solving the quadratic programming problem, decreasing the number of training examples reduces both the time needed to solve the quadratic programming problem and the storage required.
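As a rough illustration of this storage cost (the sizes below are assumed for illustration, not taken from the thesis), the dense n-by-n matrix of the quadratic programming problem grows quadratically with the number of training examples:

```python
# Back-of-envelope storage for a dense n-by-n kernel matrix of
# float64 entries, as required by the quadratic programming problem.
for n in (1_000, 10_000, 100_000):
    gigabytes = 8 * n * n / 1e9  # 8 bytes per float64 entry
    print(f"n = {n:>7,}: {gigabytes:10.3f} GB")
```

Doubling the training set quadruples the matrix, which is why removing examples before training pays off.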

Discarding examples that satisfy (1) from the training set may yield a reduced set on which to train the support vector machine. Four methods of determining linear dependence were examined, pivoting the kernel matrix according to the 1-norm, 2-norm, infinity-norm and the reduced row echelon form. The method was applied to the benchmark datasets from [1] (Diabetes, Breast Cancer, Wisconsin Breast Cancer and Ionosphere) using the polynomial kernel, and to the Regression dataset [2] with the RBF kernel.
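The norm-based pivoting variants can be sketched in one routine. This is a hedged reconstruction, not the thesis's code: at each step it selects the residual column with the largest chosen norm (1-norm, 2-norm or infinity-norm), projects that direction out of the remaining columns, and stops when every residual is numerically zero; columns never selected are the linearly dependent ones.

```python
import numpy as np

def pivoted_elimination(K, ord=2, tol=1e-10):
    """Greedy column pivoting of a kernel matrix K by a chosen vector
    norm (ord = 1, 2 or np.inf). Returns (pivot_order, dependent_cols)."""
    R = K.astype(float).copy()
    remaining = list(range(K.shape[1]))
    order = []
    while remaining:
        norms = [np.linalg.norm(R[:, j], ord) for j in remaining]
        k = int(np.argmax(norms))
        if norms[k] <= tol:
            break  # every remaining column is linearly dependent
        p = remaining.pop(k)
        order.append(p)
        q = R[:, p] / np.linalg.norm(R[:, p])  # unit pivot direction
        for j in remaining:
            R[:, j] -= (q @ R[:, j]) * q  # remove pivot component
    return order, sorted(remaining)

# Gram matrix with one dependent example (x3 = x1 + x2, linear kernel):
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
K = X @ X.T
print(pivoted_elimination(K, ord=2))
```

Changing `ord` changes which column is pivoted first when norms differ, which is exactly why the four variants can select different dependent examples and hence produce different accuracies.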

None of the selected methods for determining linear dependence achieved higher accuracy than the SMO [2] and SVMLight [3] methods consistently across all datasets. Our method gave accuracy equal to or higher than the SMO method in 81.25% (reduced row echelon form, infinity-norm and 2-norm) and 75.0% (1-norm) of the cases. For the regression case, the method gave accuracy equal to or higher than the SVMLight [3] method in 80.0% of the cases.

This thesis presents a method that can be used to decrease the number of examples required to train the support vector machine. While linearly dependent training examples satisfying (1) may be discarded before training, the method used to select those examples is vital to the resulting accuracy.

[1] C.J. Merz and P.M. Murphy, "UCI Repository of Machine Learning Databases", http://www.ics.uci.edu/~mlearn/MLRepository.html, Irvine, CA: University of California, Department of Information and Computer Science, 1998.

[2] J.C. Platt, "Fast Training of Support Vector Machines Using Sequential Minimal Optimization", in Schölkopf, Burges and Smola, Eds., *Advances in Kernel Methods - Support Vector Learning*, The MIT Press, Cambridge, MA, 1998, pp. 169-184.

[3] R. Collobert and S. Bengio, "Support Vector Machines for Large-Scale Regression Problems", http://www.idiap.ch/learning/SVMTorch.html, IDIAP-RR-00-17, 2000.