International Journal of Environmental Engineering
Author(s): SACHIYO ABURATANI, SHOHEI MARUYAMA, YASUO MATSUYAMA
Annotating genes of unknown function is an important task in bioinformatics, and identifying the genes relevant to a metabolic pathway helps to characterize them. However, the relationships between metabolic pathways and genes are complicated, so linear models have difficulty identifying the relevant genes. In this study, we propose a new method, based on support vector machines (SVMs), for inferring the genes involved in metabolic pathways from gene expression profiles. To improve the classification performance of the SVMs, we developed a method for finding the interactions that are important for classification among the huge number of possible combinations of experiments. The interactions selected by our method were added as new features to the SVM training data set. Furthermore, feature selection based on Gini importance was applied to avoid overfitting of the SVMs. To demonstrate the validity of our method, we trained SVMs on Saccharomyces cerevisiae gene expression profiles for eight metabolic pathways and evaluated their classification performance. We achieved high performance for some of the metabolic pathways. Thus, our method is useful for inferring the genes relevant to metabolic pathways.
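The feature-construction and feature-selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: an interaction between two experiments is taken to be the product of their expression values, the Gini importance is approximated by the largest Gini impurity decrease of a single threshold split (a simplified stand-in for a tree-ensemble importance), and the toy data and all function names (`add_interactions`, `best_gini_decrease`, `rank_features`) are hypothetical. The SVM training step is omitted; the top-ranked columns would be passed to the classifier.

```python
from itertools import combinations


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_gini_decrease(values, labels):
    """Largest impurity decrease from one threshold split on a feature.

    Assumption: a single-split score stands in for Gini importance.
    """
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    # Try every observed value except the largest as a threshold.
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def add_interactions(X):
    """Append the product of every pair of experiment columns to each row.

    Assumption: an "interaction" is modeled as a pairwise product.
    """
    n_cols = len(X[0])
    pairs = list(combinations(range(n_cols), 2))
    names = [f"e{i}" for i in range(n_cols)]
    names += [f"e{i}*e{j}" for i, j in pairs]
    rows = [list(r) + [r[i] * r[j] for i, j in pairs] for r in X]
    return rows, names


def rank_features(X, y):
    """Score original and interaction features, best first."""
    rows, names = add_interactions(X)
    scores = []
    for k, name in enumerate(names):
        column = [r[k] for r in rows]
        scores.append((best_gini_decrease(column, y), name))
    return sorted(scores, key=lambda s: s[0], reverse=True)


# Hypothetical toy data: rows are genes, columns are two experiments.
# Label 1 (pathway member) when both expression values share a sign,
# so neither experiment alone separates the classes.
X = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0],
     [2.0, 1.5], [-1.5, -2.0], [1.5, -1.0], [-2.0, 1.0]]
y = [1, 1, 0, 0, 1, 1, 0, 0]

ranking = rank_features(X, y)
for score, name in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy case the product column separates the two classes with one threshold while each single experiment barely reduces impurity, which is the kind of situation in which an added interaction feature could help a classifier. Keeping only the top-ranked columns before training limits the number of features, in line with the abstract's stated aim of avoiding overfitting.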