
International Journal of Advances in Computer Science and Its Applications

An Analysis of Feature Selection Methods for Multiclass Text Classification

Author(s) : MAYANK KALBHOR, SANJAY AGARWAL

Abstract

Features play a vital role in classifying objects into different classes, so identifying the best features is the backbone of the classification process. In text classification the features are simply words, and the resulting feature space has very high dimensionality, so finding the most appropriate feature set is a major challenge. This paper analyzes several feature selection methods for multiclass text classification and compares their results across different classifiers on an email classification task. We run our experiments on the 20 Newsgroups and PU corpora datasets. The experiments cover well-known feature selection methods, including Term Selection, Document Frequency, Mutual Information, Odds Ratio, and Chi-square. The paper concludes that Mutual Information and Chi-square are the most appropriate for text classification.
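The abstract names the selection criteria without defining them. As an illustration only, the sketch below scores a term against a class using the contingency-table forms of Document Frequency, Chi-square, and (pointwise) Mutual Information that are common in the text categorization literature; the paper's exact formulations may differ. The toy corpus and all function names are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical toy corpus: each document is a set of tokens with one label.
DOCS = [
    {"free", "offer"},
    {"free", "money"},
    {"meeting", "agenda"},
    {"meeting", "free"},
]
LABELS = ["spam", "spam", "ham", "ham"]


def contingency(docs, labels, term, cls):
    """Return (a, b, c, d) document counts for a term and a class.

    a: term present, in class      b: term present, not in class
    c: term absent,  in class      d: term absent,  not in class
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        present = term in tokens
        in_class = label == cls
        if present and in_class:
            a += 1
        elif present:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    return a, b, c, d


def document_frequency(a, b, c, d):
    """Number of documents that contain the term."""
    return a + b


def chi_square(a, b, c, d):
    """Chi-square statistic of the 2x2 term/class table (0.0 if degenerate)."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def mutual_information(a, b, c, d):
    """Pointwise mutual information log(P(t, c) / (P(t) * P(c)))."""
    n = a + b + c + d
    if a == 0:
        return float("-inf")  # term never co-occurs with the class
    return math.log(a * n / ((a + c) * (a + b)))


def max_over_classes(docs, labels, term, score):
    """One common multiclass reduction: keep a term's best per-class score."""
    return max(score(*contingency(docs, labels, term, cls))
               for cls in set(labels))


if __name__ == "__main__":
    for term in ("meeting", "free"):
        print(term,
              max_over_classes(DOCS, LABELS, term, chi_square),
              max_over_classes(DOCS, LABELS, term, mutual_information))
```

Ranking terms by such a score and keeping the top k is one way a selected feature set could be formed; taking the maximum over classes is only one possible multiclass reduction, and an average weighted by class prior is another.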

No. of Author(s) : 2
Page(s) : 557-561
Electronic ISSN : 2250 - 3765
Volume 8 : Issue 1