A kernel approach for learning from almost orthogonal patterns

Bernhard Schoelkopf, Jason Weston, Eleazar Eskin, Christina Leslie and William Stafford Noble

Proceedings of the 13th European Conference on Machine Learning, August 19-23, 2002. pp. 511-528.


In kernel methods, all the information about the training data is contained in the Gram matrix. If this matrix has diagonal values that are large relative to its off-diagonal entries, as happens for many types of kernels, then kernel methods do not perform well. We propose and test several methods for dealing with this problem by reducing the dynamic range of the matrix while preserving the positive definiteness of the Hessian of the quadratic programming problem that one has to solve when training a Support Vector Machine.
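
To make the idea concrete, the following is a minimal sketch of one way to reduce the dynamic range of a Gram matrix while keeping it positive semidefinite. It is an illustration under stated assumptions, not necessarily the procedure evaluated in the paper: the elementwise power transform, the step that multiplies the transformed matrix by its own transpose, the exponent p = 0.5, the helper names, and the synthetic sparse data are all choices made here for the example.

```python
import numpy as np


def reduce_dynamic_range(K, p=0.5):
    """Compress the entries of K elementwise with sign(K) * |K|**p, 0 < p < 1.

    Assumption for this sketch: the result is in general NOT guaranteed
    to be positive semidefinite.
    """
    return np.sign(K) * np.abs(K) ** p


def restore_psd(K):
    """Return K @ K.T, which is symmetric positive semidefinite by construction."""
    return K @ K.T


def diagonal_ratio(K):
    """Mean diagonal entry divided by mean absolute off-diagonal entry."""
    off_diagonal = K[~np.eye(K.shape[0], dtype=bool)]
    return np.mean(np.diag(K)) / np.mean(np.abs(off_diagonal))


# Synthetic, hypothetical data: sparse non-negative vectors in a
# high-dimensional space, so distinct patterns are almost orthogonal.
rng = np.random.default_rng(0)
mask = rng.random((20, 2000)) < 0.02
X = rng.random((20, 2000)) * mask

K = X @ X.T                              # linear-kernel Gram matrix, large diagonal
K_compressed = reduce_dynamic_range(K)   # smaller dynamic range, PSD not guaranteed
K_new = restore_psd(K_compressed)        # PSD again, usable as a precomputed kernel

print("diagonal ratio before:", diagonal_ratio(K))
print("diagonal ratio after compression:", diagonal_ratio(K_compressed))
print("smallest eigenvalue of K_new:", np.linalg.eigvalsh(K_new).min())
```

The final multiplication is what keeps the quadratic program well posed in this sketch: a matrix of the form A A^T has no negative eigenvalues, whatever the compression step did to the spectrum of the original matrix.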