Multi-class protein fold recognition using adaptive codes

Eugene Ie, Jason Weston, William Stafford Noble, Christina Leslie

Proceedings of the 22nd International Conference on Machine Learning, August 7-11, 2005, Bonn, Germany.


We develop a novel multi-class classification method based on output codes for the problem of classifying a sequence of amino acids into one of many known protein structural classes, called folds. Our method learns relative weights between one-vs-all classifiers and encodes information about the protein structural hierarchy for multi-class prediction. Our code weighting approach significantly improves on the standard one-vs-all method for the fold recognition problem. In order to compare against widely used methods in protein sequence analysis, we also test nearest neighbor approaches based on the PSI-BLAST algorithm. Our code weight learning algorithm strongly outperforms these PSI-BLAST methods on every structure recognition problem we consider.
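As an illustration of the idea of learning relative weights between one-vs-all classifiers, the following is a minimal sketch, not the authors' implementation. It assumes each one-vs-all classifier has already produced a real-valued score per class, that prediction picks the class with the largest weighted score, and that weights are adjusted with a simple perceptron-style update on misclassified examples. The function names `predict` and `learn_weights`, the learning rate, and the toy scores are all hypothetical, and the sketch omits the structural-hierarchy information that the method also encodes.

```python
def predict(weights, scores):
    """Return the index of the class with the largest weighted one-vs-all score."""
    weighted = [w * s for w, s in zip(weights, scores)]
    return max(range(len(weighted)), key=weighted.__getitem__)


def learn_weights(score_rows, labels, epochs=10, lr=0.1):
    """Perceptron-style weight learning (an assumed, simplified update rule).

    On each mistake, shift weight toward the true class's classifier and
    away from the wrongly predicted class's classifier.
    """
    n_classes = len(score_rows[0])
    weights = [1.0] * n_classes  # start from plain (unweighted) one-vs-all
    for _ in range(epochs):
        for scores, y in zip(score_rows, labels):
            y_hat = predict(weights, scores)
            if y_hat != y:
                weights[y] += lr * scores[y]
                weights[y_hat] -= lr * scores[y_hat]
    return weights


# Toy data: classifier 1 outputs systematically smaller scores than
# classifier 0, so unweighted one-vs-all mislabels the first example.
toy_scores = [[1.0, 0.5], [2.0, 0.2]]
toy_labels = [1, 0]
learned = learn_weights(toy_scores, toy_labels)
toy_predictions = [predict(learned, s) for s in toy_scores]
```

With equal weights, the first toy example is assigned to class 0 because that classifier's raw score is larger; after weight learning, both toy examples are assigned to their labeled classes. Real classifier outputs would not generally be this cleanly separable.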

Supplementary data and code