Spam Mail Filtering Technique using Different Decision Tree Classifiers through Data Mining Approach - A Comparative Performance Analysis
DOI: 10.5120/7274-0435
Sarit Chakraborty and Bikromadittya Mondal. Article: Spam Mail Filtering Technique using Different Decision Tree Classifiers through Data Mining Approach - A Comparative Performance Analysis. International Journal of Computer Applications 47(16):26-31, June 2012. Full text available. BibTeX
@article{key:article,
  author = {Sarit Chakraborty and Bikromadittya Mondal},
  title = {Article: Spam Mail Filtering Technique using Different Decision Tree Classifiers through Data Mining Approach - A Comparative Performance Analysis},
  journal = {International Journal of Computer Applications},
  year = {2012},
  volume = {47},
  number = {16},
  pages = {26-31},
  month = {June},
  note = {Full text available}
}
Abstract
In recent years, a large share of communication has taken place through e-mail, which is often subject to passive or active attacks. Effective spam filtering is needed to handle such attacks. Many efficient spam filters are now available with differing degrees of performance, and their accuracy usually ranges between 60% and 80% on average. However, most filtering techniques cannot cope with the frequently changing tactics that spammers adopt over time. Improved spam control algorithms, or enhancements that push existing data mining algorithms to their full potential, are therefore urgently required. In this paper, three decision tree classification techniques from data mining, namely the Naïve Bayes Tree classifier (NBT), the C4.5 (or J48) decision tree classifier, and the Logistic Model Tree classifier (LMT), are studied and analyzed for spam mail filtering. The test results show that LMT performs best, detecting spam and non-spam (ham) mails with almost 90% accuracy.
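The abstract compares classifiers by their accuracy on spam and ham mails. As a rough illustration of how such a comparison can be scored, the sketch below computes accuracy and confusion counts from lists of true and predicted labels. It is not the authors' evaluation code: the function names, the classifier names `classifier_a` and `classifier_b`, and the toy labels are all hypothetical and do not reproduce the paper's data or results.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="spam"):
    """Count tp/fn/fp/tn, treating `positive` as the target class."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive:
            counts["tp" if p == positive else "fn"] += 1
        else:
            counts["fp" if p == positive else "tn"] += 1
    return counts


def accuracy(actual, predicted):
    """Fraction of messages (spam and ham together) labelled correctly."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def rank_by_accuracy(actual, predictions_by_classifier):
    """Return (name, accuracy) pairs sorted from most to least accurate."""
    scores = {
        name: accuracy(actual, preds)
        for name, preds in predictions_by_classifier.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up labels purely for illustration; NOT the paper's data or results.
actual = ["spam", "ham", "spam", "ham", "ham",
          "spam", "ham", "ham", "spam", "ham"]
toy_predictions = {
    # Hypothetical classifier outputs (assumed for this example only).
    "classifier_a": ["spam", "ham", "ham", "ham", "spam",
                     "spam", "ham", "ham", "spam", "ham"],
    "classifier_b": ["spam", "ham", "spam", "ham", "ham",
                     "spam", "ham", "spam", "spam", "ham"],
}

if __name__ == "__main__":
    for name, score in rank_by_accuracy(actual, toy_predictions):
        print(f"{name}: accuracy = {score:.2f}")
```

Accuracy alone treats a missed spam mail and a wrongly blocked ham mail as equally costly; the confusion counts keep the two error types separate for cases where that distinction matters.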