Mining Text for Meaningful Words with Stemming Algorithm

Research Article

Published in August 2016 by Priti Shende, V. B. Kute

Advanced Computing and Information Technology

Foundation of Computer Science USA

TACIT2016 - Number 1

August 2016

Authors: Priti Shende, V. B. Kute

Priti Shende, V. B. Kute. Mining Text for Meaningful Words with Stemming Algorithm. Advanced Computing and Information Technology. TACIT2016, 1 (August 2016), 13-16.

@article{

author = { Priti Shende and V. B. Kute },

title = { Mining Text for Meaningful Words with Stemming Algorithm },

journal = { Advanced Computing and Information Technology },

issue_date = { August 2016 },

volume = { TACIT2016 },

number = { 1 },

month = { August },

year = { 2016 },

issn = {0975-8887},

pages = { 13-16 },

numpages = 4,

url = { /proceedings/tacit2016/number1/25830-it43/ },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Proceeding Article

%1 Advanced Computing and Information Technology

%A Priti Shende

%A V. B. Kute

%T Mining Text for Meaningful Words with Stemming Algorithm

%J Advanced Computing and Information Technology

%@ 0975-8887

%V TACIT2016

%N 1

%P 13-16

%D 2016

%I International Journal of Computer Applications

Abstract

With the explosive growth of information on the Internet, data is easy to obtain. However, raw data becomes useful only when it is mined, which makes mining an important research area. Text mining primarily aims at the discovery and retrieval of useful and interesting patterns from a large database. Identifying and understanding the appropriate words is important for retrieving the appropriate documents. Consulting a dictionary every time to understand the meaning of a word is time-consuming and tedious. This can be avoided by converting the different forms in which a word occurs to their root. The frequency of word occurrences in a file is used to prioritize documents. This work aims to avoid the generation of incomplete and meaningless words during stemming. We propose a method that compares the different forms of words present in a document up to a certain length: sixty percent of the word's length is considered for comparison. Words that have these letters in common are considered different forms of the same root.
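The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which word's length the sixty percent applies to, how fractions are rounded, or how the root is chosen, so the sketch makes three labeled assumptions: the prefix is taken from the shorter of the two words, its length is rounded up, and the shortest form found in the document stands for the group, so the root is always a complete word. All function names are hypothetical.

```python
from collections import Counter

PREFIX_PERCENT = 60  # "sixty percent" of the word length, per the abstract


def prefix_len(word, percent=PREFIX_PERCENT):
    """Number of leading letters to compare (rounding up is an assumption)."""
    return max(1, (len(word) * percent + 99) // 100)


def same_root(a, b, percent=PREFIX_PERCENT):
    """Assumption: compare over sixty percent of the shorter word."""
    n = prefix_len(min(a, b, key=len), percent)
    return a[:n] == b[:n]


def group_word_forms(words, percent=PREFIX_PERCENT):
    """Group word forms; the shortest form in each group acts as its root."""
    groups = []
    # Visit shorter words first so each group starts with a complete word.
    for word in sorted({w.lower() for w in words}, key=lambda w: (len(w), w)):
        for group in groups:
            if same_root(group[0], word, percent):
                group.append(word)
                break
        else:
            groups.append([word])
    return {group[0]: group for group in groups}


def root_frequencies(words, percent=PREFIX_PERCENT):
    """Count occurrences per root, e.g. as input for prioritizing documents."""
    groups = group_word_forms(words, percent)
    root_of = {form: root for root, forms in groups.items() for form in forms}
    return Counter(root_of[w.lower()] for w in words)


if __name__ == "__main__":
    text = "connect connected connecting connection mining mined mine".split()
    print(group_word_forms(text))
    print(root_frequencies(text))
```

Under these assumptions a short prefix can merge unrelated words: "mine" and "mind" share the three letters "min" and would land in the same group.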

Index Terms

Computer Science

Information Sciences

Keywords

Complete Words, Sixty Percent Length, Porter's Stemming Algorithm