Email classification for Spam Detection using Word Stemming

D.Karthika Renuka; T.Hamsapriya

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 1 - Number 5

Year of Publication: 2010

Authors: D.Karthika Renuka, T.Hamsapriya

10.5120/125-241

D.Karthika Renuka, T.Hamsapriya. Email classification for Spam Detection using Word Stemming. International Journal of Computer Applications. 1, 5 (February 2010), 45-47. DOI=10.5120/125-241

@article{ 10.5120/125-241,
  author = { D.Karthika Renuka, T.Hamsapriya },
  title = { Email classification for Spam Detection using Word Stemming },
  journal = { International Journal of Computer Applications },
  issue_date = { February 2010 },
  volume = { 1 },
  number = { 5 },
  month = { February },
  year = { 2010 },
  issn = { 0975-8887 },
  pages = { 45-47 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume1/number5/125-241/ },
  doi = { 10.5120/125-241 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T19:44:23.020725+05:30
%A D.Karthika Renuka
%A T.Hamsapriya
%T Email classification for Spam Detection using Word Stemming
%J International Journal of Computer Applications
%@ 0975-8887
%V 1
%N 5
%P 45-47
%D 2010
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Unsolicited emails, known as spam, are one of the fast-growing and costly problems associated with the Internet today. Among the many proposed solutions, Bayesian filtering is considered the most effective weapon against spam. Bayesian filtering works by evaluating the probability of different words appearing in legitimate and spam mails and then classifying messages based on those probabilities. Most current spam email detection systems use keywords to detect spam emails. These keywords can be written as misspellings, e.g. baank or bannk instead of bank. Misspellings change from time to time, and hence a spam email detection system needs to constantly update its blacklist to detect spam emails containing misspellings. It is impossible to predict all possible misspellings for a given keyword and add them to the blacklist. In this paper, a more successful approach for improving email content classification for spam control is proposed. It uses the Word Stemming or Word Hashing technique to improve the efficiency of the content-based spam filter. The proposed system extracts the base or stem of a misspelled or modified word to detect spam emails. It takes every misspelled keyword, applies a word stemming technique, and passes the base word to the content-based filter. Using a proposed if-then rule, we can decide whether or not an unknown mail is spam [1]. This paper also provides an email archiving solution which classifies email relating to a person, family, corporation, association, community, or nation.
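
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the stemming or hashing algorithm or the if-then rule, so the normalization steps in `base_form` (character substitutions, collapsing repeated letters, stripping a few suffixes), the Laplace-smoothed naive Bayes score, and the 0.5 threshold are all assumptions chosen for illustration. All function and variable names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical table of common character swaps used to disguise keywords (assumption).
SUBSTITUTIONS = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)

def base_form(word):
    """Map a possibly misspelled word to a crude base key.

    Stand-in for the stemming/hashing step: the same key is produced for
    'bank', 'baank', 'bannk' and 'b@nk'. Correctly spelled double letters
    are collapsed too, which is harmless as long as training and
    classification use the same function.
    """
    w = word.lower().translate(SUBSTITUTIONS)
    w = re.sub(r"[^a-z]", "", w)        # drop anything that is not a letter
    w = re.sub(r"(.)\1+", r"\1", w)     # collapse runs of a repeated letter
    for suffix in ("ing", "ed", "s"):   # strip one simple suffix, keep >= 3 letters
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w

def tokens(text):
    """Split text into raw tokens and reduce each one to its base key."""
    raw = re.findall(r"[A-Za-z0-9@$]+", text)
    return [b for b in map(base_form, raw) if b]

def train(labelled):
    """Count base keys per class from (text, is_spam) pairs."""
    counts = {True: Counter(), False: Counter()}
    n_msgs = {True: 0, False: 0}
    for text, label in labelled:
        counts[label].update(tokens(text))
        n_msgs[label] += 1
    return {"counts": counts, "n_msgs": n_msgs}

def spam_probability(text, model):
    """Naive Bayes posterior P(spam | text) with add-one smoothing."""
    spam, ham = model["counts"][True], model["counts"][False]
    vocab = len(set(spam) | set(ham))
    spam_total, ham_total = sum(spam.values()), sum(ham.values())
    # Start from the prior odds; assumes both classes have training messages.
    log_odds = math.log(model["n_msgs"][True] / model["n_msgs"][False])
    for tok in tokens(text):
        p_spam = (spam[tok] + 1) / (spam_total + vocab)
        p_ham = (ham[tok] + 1) / (ham_total + vocab)
        log_odds += math.log(p_spam / p_ham)
    return 1.0 / (1.0 + math.exp(-log_odds))

def is_spam(text, model, threshold=0.5):
    """If-then rule (assumed form): flag the mail if the posterior exceeds the threshold."""
    return spam_probability(text, model) > threshold

# Toy training data, invented for this example.
model = train([
    ("Verify your bank account now", True),
    ("Cheap loan from our bank", True),
    ("Meeting notes attached", False),
    ("Lunch at noon tomorrow", False),
])
print(is_spam("Update your baank details", model))
print(is_spam("Lunch meeting tomorrow", model))
```

Because the misspelled keyword is reduced to its base key before it reaches the filter, the blacklist or training vocabulary only needs the base word; a plain keyword match on "baank" would otherwise miss it.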

References

Leonard and Hsu, 2001. Bayesian methods: an analysis for statisticians and interdisciplinary researchers. Cambridge University Press, Cambridge.
Bernardo and Smith, 1994. Bayesian theory. John Wiley and Sons, Chichester.
Clayton, R. (2004). Stopping spam by extrusion detection. Proceedings of the First Conference on Email and Anti-Spam (CEAS).
Orwant J. et al. Mastering Algorithms with Perl. O’Reilly and Associates, ISBN: 1-56592-398-7, 1999.
Amavisd-new Home Page, http://www.ijs.si/software/amavisd, Accessed 01 July 2004.
Sendmail Home Page, http://www.sendmail.org, Accessed 01 July 2004.
SpamAssassin Home Page, http://www.spamassassin.org, Accessed 01 July 2004.
Procmail Home Page, http://www.procmail.org, Accessed 03 Mar 2004.
Graham, P. Better Bayesian Filtering. In Proceedings of Spam Conference, 2003.
http://www.Blog Spam Database.com
http://www.Email Spam Filter Word List.com
http://www.ceas.cc/papers-2004/172.pdf.
Deborah Fallows. Internet Users and Spam: What the attitudes and behavior of Internet users can tell us about fighting spam. Pew Internet & American Life Project, Washington, DC, 20036 USA.

Index Terms

Computer Science

Information Sciences

Keywords

Spam Filters, Bayesian content based spam filter, Word Stemming, Email, Email archiving