Mining Text for Meaningful Words with Stemming Algorithm

IJCA Proceedings on Trends in Advanced Computing and Information Technology
© 2016 by IJCA Journal
TACIT 2016 - Number 1
Year of Publication: 2016
Authors:
Priti Shende
V. B. Kute

Priti Shende and V B Kute. Article: Mining Text for Meaningful Words with Stemming Algorithm. IJCA Proceedings on Trends in Advanced Computing and Information Technology TACIT 2016(1):13-16, August 2016. Full text available. BibTeX

@article{key:article,
	author = {Priti Shende and V. B. Kute},
	title = {Article: Mining Text for Meaningful Words with Stemming Algorithm},
	journal = {IJCA Proceedings on Trends in Advanced Computing and Information Technology},
	year = {2016},
	volume = {TACIT 2016},
	number = {1},
	pages = {13-16},
	month = {August},
	note = {Full text available}
}

Abstract

With the explosive growth of information on the Internet, data is easy to obtain. However, raw data becomes useful only when it is mined, which makes mining an important research area. Text mining primarily aims at the discovery and retrieval of useful and interesting patterns from a large database. Identifying and understanding the appropriate words is important for retrieving the appropriate documents. Consulting a dictionary every time to understand the meaning of a word is tedious and time-consuming. This can be avoided by converting the different forms in which a word occurs to their common root. The frequency of word occurrences in a file is then used to prioritize documents. This work aims to avoid the generation of incomplete and meaningless words during stemming. We propose a method that compares the different forms of words present in a document up to a certain length: sixty percent of the length of the word is considered for comparison. Words that have these letters in common are considered different forms of the same root.
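
The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions, since the abstract does not specify them: the sixty percent is taken from the shorter of the two words and rounded up, the letters compared are the leading ones (a prefix), and the shortest form seen in the text serves as the root, so that every reported root is a complete word from the document. The names `prefix_len`, `same_root`, and `group_word_forms` are hypothetical.

```python
import re
from collections import Counter

PREFIX_PERCENT = 60  # "sixty percent" of the word length, as stated in the abstract


def prefix_len(word, percent=PREFIX_PERCENT):
    """Number of leading letters to compare (assumption: rounded up)."""
    return max(1, (len(word) * percent + 99) // 100)


def same_root(a, b, percent=PREFIX_PERCENT):
    """Treat two words as forms of one root if they agree on the first
    60% of the shorter word (assumption: the shorter word sets the length)."""
    shorter = min(a, b, key=len)
    n = prefix_len(shorter, percent)
    return a[:n] == b[:n]


def group_word_forms(text):
    """Map each root (the shortest form seen) to the total frequency
    of all its forms in the text."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    groups = {}
    # Visit shorter words first so that the root of each group is a
    # complete word from the text, never a truncated stem.
    for word in sorted(counts, key=lambda w: (len(w), w)):
        for root in groups:
            if same_root(root, word):
                groups[root] += counts[word]
                break
        else:
            groups[word] = counts[word]
    return groups


if __name__ == "__main__":
    sample = "connect connected connecting connection mine mining"
    print(group_word_forms(sample))
```

The per-root totals returned by `group_word_forms` could then serve as the word frequencies used to prioritize documents. A prefix comparison of this kind can also merge unrelated words that happen to share their opening letters, so the sketch says nothing about the accuracy of the proposed method.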
