Scaling up support vector data description by using core-sets

Calvin S. Chu, Ivor W. Tsang, James T. Kwok

Abstract: Support vector data description (SVDD) is a powerful kernel method that has been commonly used for novelty detection. While its quadratic programming formulation has the important computational advantage of avoiding the problem of local minimum, this has a runtime complexity of $O(N^3)$, where $N$ is the number of training patterns. It thus becomes prohibitive when the data set is large. Inspired from the use of core-sets in approximating the minimum enclosing ball problem in computational geometry, we propose in this paper an approximation method that allows SVDD to scale better to larger data sets. Most importantly, the proposed method has a running time that is only linear in $N$. Experimental results on two large real-world data sets demonstrate that the proposed method can handle data sets that are much larger than those that can be handled by standard SVDD packages, while its approximate solution still attains equally good, or sometimes even better, novelty detection performance.

Proceedings of the International Joint Conference on Neural Networks (IJCNN 2004), pp.425-430, Budapest, Hungary, July 2004.


Back to James Kwok's home page.