
Survey of Text Mining Techniques, Challenges and their Applications

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2016
N. Venkata Sailaja, L. Padmasree, N. Mangathayaru

Venkata N Sailaja, L Padmasree and N Mangathayaru. Survey of Text Mining Techniques, Challenges and their Applications. International Journal of Computer Applications 146(11):30-35, July 2016.

BibTeX

	author = {N. Venkata Sailaja and L. Padmasree and N. Mangathayaru},
	title = {Survey of Text Mining Techniques, Challenges and their Applications},
	journal = {International Journal of Computer Applications},
	issue_date = {July 2016},
	volume = {146},
	number = {11},
	month = {Jul},
	year = {2016},
	issn = {0975-8887},
	pages = {30-35},
	numpages = {6},
	url = {},
	doi = {10.5120/ijca2016910908},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


In everyday life, communication among people through chat, messaging, comments, and board posts leads to mutual learning and the sharing of valuable knowledge. Social networking websites and search engines likewise share huge volumes of text on the web. Text is nothing but a combination of characters, so analyzing such data sets and extracting information patterns from them is complex. Several methods have been proposed for analyzing such texts and extracting information.

In this paper, we present different text mining techniques for discovering textual patterns from a variety of sources. The topic also draws on areas such as information retrieval, machine learning, statistics, computational data science, and advanced data mining. We also discuss future challenges in this area, particularly for rough set based text mining techniques, along with improvements and research directions.




Data mining, Text mining, Rough sets, Classification, Summarization, and Text categorization.