International Journal of Advances in Computer Science and Its Applications
Author(s): R B V SUBRAMANYAM, T RAMAKRISHNUDU
Mining frequent and infrequent itemsets from a given dataset is one of the most significant areas of data mining. When frequent and infrequent itemsets are mined together, the infrequent itemsets become very important because many useful negative association rules can be derived from them. Infrequent weighted itemset mining is the process of mining infrequent itemsets from a weighted dataset. Many weighted itemset mining algorithms scan the dataset multiple times, so when the dataset is very large, both the memory usage and the computational cost of the mining algorithm become very high. In addition, the memory and other resources of a single machine are insufficient to handle very large weighted datasets. Parallel and distributed computing offers an alternative solution to these problems. In this paper we propose an infrequent weighted itemset mining method on the Hadoop MapReduce framework that can handle huge datasets. Experiments are performed on an 8-node cluster with a synthetic dataset. The experimental results show that the proposed method is very efficient in handling very large datasets.
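The general idea of counting weighted itemsets in a map/reduce style can be illustrated with a minimal single-process sketch. It is not the method proposed in the paper, whose details are not given in this abstract. The sketch assumes that the weight of an itemset in a transaction is the minimum of its item weights, that weighted support is the sum of those weights over all transactions containing the itemset, and that an itemset is infrequent when its support is at most a threshold. The names `map_phase`, `reduce_phase`, `infrequent_weighted_itemsets`, `max_support`, and `max_size` are hypothetical, and the toy dataset is invented for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, max_size=2):
    """Emit (itemset, weight) pairs for one weighted transaction.

    Assumption: the weight of an itemset in a transaction is the
    minimum of the weights of its items in that transaction.
    """
    items = sorted(transaction)
    for k in range(1, max_size + 1):
        for itemset in combinations(items, k):
            yield itemset, min(transaction[i] for i in itemset)


def reduce_phase(pairs):
    """Sum the emitted weights per itemset (the weighted support)."""
    totals = defaultdict(float)
    for itemset, weight in pairs:
        totals[itemset] += weight
    return dict(totals)


def infrequent_weighted_itemsets(dataset, max_support, max_size=2):
    """Keep itemsets whose weighted support is at most max_support.

    Only itemsets that occur in at least one transaction are counted.
    """
    pairs = (p for t in dataset for p in map_phase(t, max_size))
    support = reduce_phase(pairs)
    return {s: v for s, v in support.items() if v <= max_support}


# Toy weighted dataset: each transaction maps item -> weight.
dataset = [
    {"a": 0.5, "b": 0.25},
    {"a": 0.5, "c": 1.0},
    {"b": 0.25, "c": 1.0},
]
result = infrequent_weighted_itemsets(dataset, max_support=0.5)
print(result)
```

In a real Hadoop job the map and reduce steps would run on separate nodes over partitions of the dataset; here they are plain functions chained in one process so the data flow is easy to follow.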