
International Journal of Advances in Computer Science and Its Applications

Using bipartite graphs projected onto two dimensions for text classification



In our Big Data world, the amount of text being gathered is ever expanding. For many years, data curators have sought ways to group these documents and identify common topics. As the size of the problem increases, solutions that scale are needed. This work presents a novel text classifier that can be used for text mining and interactive information access. The model can be used to extract hierarchical relations between topics, as well as to conduct unsupervised clustering of documents and keywords. The approach combines graph-of-words key term extraction with a two-dimensional projection of the bipartite graph of documents and key terms. This projection makes it possible to co-cluster terms efficiently in relation to their documents, and documents in relation to their terms. Furthermore, the key term extraction process can be scaled to a large corpus using a distributed processing system such as Apache Spark, and the resulting model can be explored visually and interactively by users.
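
The pipeline described in the abstract can be illustrated with a small sketch. The abstract does not say how key terms are ranked or how the bipartite graph is projected, so the choices below are assumptions made for illustration only: terms are ranked by their degree in a sliding-window co-occurrence graph, and the document-term incidence matrix is projected onto two dimensions with a truncated SVD. The function names (`graph_of_words`, `key_terms`, `project_bipartite`) and the parameters `window` and `top_k` are hypothetical and do not come from the paper.

```python
from collections import defaultdict

import numpy as np


def graph_of_words(tokens, window=3):
    """Undirected co-occurrence graph: link terms that fall within `window` tokens."""
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    return neighbors


def key_terms(tokens, top_k=3, window=3):
    """Rank terms by degree in the graph of words (assumed ranking rule)."""
    graph = graph_of_words(tokens, window)
    # Sort by descending degree, then alphabetically so ties are deterministic.
    ranked = sorted(graph, key=lambda t: (-len(graph[t]), t))
    return ranked[:top_k]


def project_bipartite(docs, top_k=3, window=3):
    """Build the document/key-term bipartite graph and project it onto 2D.

    The projection uses a truncated SVD of the incidence matrix, which is an
    assumption; the paper may use a different projection.
    """
    doc_terms = [key_terms(d, top_k, window) for d in docs]
    vocab = sorted({t for terms in doc_terms for t in terms})
    index = {t: j for j, t in enumerate(vocab)}

    # Incidence matrix: B[i, j] = 1 if key term j was extracted from document i.
    B = np.zeros((len(docs), len(vocab)))
    for i, terms in enumerate(doc_terms):
        for t in terms:
            B[i, index[t]] = 1.0

    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    k = min(2, len(s))
    doc_xy = U[:, :k] * s[:k]    # one coordinate row per document
    term_xy = Vt[:k].T * s[:k]   # one coordinate row per key term
    return vocab, doc_xy, term_xy


# Toy corpus, invented for this example.
docs = [
    "spark cluster runs distributed jobs on spark cluster".split(),
    "distributed spark jobs scale on cluster".split(),
    "graph nodes and graph edges form a graph".split(),
    "bipartite graph links nodes by edges".split(),
]
vocab, doc_xy, term_xy = project_bipartite(docs)
```

Because documents and key terms end up in the same two-dimensional space, either set of points can be passed to any standard clustering routine or plotted together for interactive inspection. In this sketch, `key_terms` works on one document at a time, so it could in principle be applied to each document independently in a distributed system such as Apache Spark; that mapping is not shown here.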

No. of Author(s) : 2
Page(s) : 291-295
Electronic ISSN : 2250 - 3765
Volume 8 : Issue 1