Computational Biology

Interpreting large, complex genomic data sets requires sophisticated analytical tools. The field of machine learning has contributed an increasing variety of such tools, including probabilistic methods such as hidden Markov models and Bayesian networks, as well as nonparametric statistical techniques such as the support vector machine. To be useful, such methods must produce accurate, interpretable results. They must also handle many different types of data, sometimes simultaneously, and they must scale to very large data sets.

In the context of yeast genomics and proteomics, we have applied the support vector machine (SVM) algorithm to many different problems. The SVM is a classification algorithm, functionally similar to decision trees and artificial neural networks. The SVM is supervised, in the sense that it relies upon an annotated set of training data for the learning phase. During the subsequent prediction phase, the SVM predicts classifications for unannotated test data. From a learning-theoretic perspective, the SVM has a strong foundation. Empirically, the algorithm has been used to obtain state-of-the-art performance in applications as diverse as handwriting recognition and natural language processing. Here are problems that we have addressed using SVMs: