
Visual Methods for Examining Support Vector Machine Results, with Applications to Gene Expression Data Analysis



Caragea, Doina, Cook, Dianne and Honavar, Vasant (2005) Visual Methods for Examining Support Vector Machine Results, with Applications to Gene Expression Data Analysis. Technical Report, Computer Science, Iowa State University.

Full text available as: Adobe PDF

Abstract

Support vector machines (SVMs) offer a theoretically well-founded approach to automated learning of pattern classifiers. They have been shown to give highly accurate results in complex classification problems such as gene expression analysis. The SVM algorithm is also quite intuitive, with a few inputs to vary in the fitting process and several outputs that are interesting to study. For many data mining tasks (e.g., cancer prediction), finding classifiers with good predictive accuracy is important, but understanding the classifier is equally important. By studying the classifier outputs, we may be able to produce a simpler classifier, learn which variables are the important discriminators between classes, and find the samples that are problematic for the classification. Visual methods for exploratory data analysis can help us study these outputs and complement automated classification algorithms in data mining. We present the use of tour-based methods to plot aspects of the SVM classifier. This approach provides insights into the cluster structure in the data, the nature of the boundaries between clusters, and problematic outliers. Furthermore, tours can be used to assess variable importance. We show how visual methods can complement cross-validation in finding good SVM input parameters for a particular data set.
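
As a rough illustration of the tour idea mentioned in the abstract, the sketch below draws one random orthonormal 2-D projection frame for p-dimensional data and projects a few samples onto it. A tour displays a sequence of such low-dimensional projections; this sketch produces only a single frame and does not interpolate between frames. It is not the authors' implementation: the function names `random_projection_frame` and `project` and the toy data are hypothetical, chosen only for this example.

```python
import math
import random


def random_projection_frame(p, d=2, rng=None):
    """Return d orthonormal vectors in R^p, built by Gram-Schmidt on Gaussian draws."""
    if d > p:
        raise ValueError("frame dimension d cannot exceed data dimension p")
    rng = rng or random.Random()
    frame = []
    while len(frame) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(p)]
        # Remove the components of v that lie along vectors already in the frame.
        for u in frame:
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-12:  # skip a degenerate draw and try again
            frame.append([a / norm for a in v])
    return frame


def project(rows, frame):
    """Project each p-dimensional row onto the frame, giving d coordinates per row."""
    return [[sum(a * b for a, b in zip(row, u)) for u in frame] for row in rows]


# Hypothetical toy data: 4 samples measured on 5 variables (not from the report).
toy = [
    [1.0, 0.2, 0.0, 3.1, 0.5],
    [0.9, 0.1, 0.2, 2.8, 0.4],
    [0.1, 1.5, 2.2, 0.3, 1.9],
    [0.0, 1.7, 2.0, 0.2, 2.1],
]

frame = random_projection_frame(p=5, d=2, rng=random.Random(0))
coords = project(toy, frame)  # one 2-D view of the data, ready for a scatterplot
```

Each row of `coords` is the 2-D position of one sample in this particular view. Plotting many such views in sequence, with points colored by class or by support-vector status, is one way to look for the cluster structure and boundaries the abstract describes.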

Keywords: Classification Problems, Support Vector Machines, Visual Methods, Gene Expression Data
Subjects: Computing Methodologies: ARTIFICIAL INTELLIGENCE
Computing Methodologies: ARTIFICIAL INTELLIGENCE: Learning (K.3.2)
ID code: 00000394
Deposited by: Doina Caragea on 13 December 2005


