Research Article

Correlated Concept based Topic Updation Model for Dynamic Corpora

by J. Jayabharathy, S. Kanmani, N. Sivaranjani
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 89 - Number 10
Year of Publication: 2014
Authors: J. Jayabharathy, S. Kanmani, N. Sivaranjani
DOI: 10.5120/15664-3467

J. Jayabharathy, S. Kanmani, N. Sivaranjani. Correlated Concept based Topic Updation Model for Dynamic Corpora. International Journal of Computer Applications. 89, 10 (March 2014), 1-7. DOI=10.5120/15664-3467

@article{ 10.5120/15664-3467,
author = { J. Jayabharathy and S. Kanmani and N. Sivaranjani },
title = { Correlated Concept based Topic Updation Model for Dynamic Corpora },
journal = { International Journal of Computer Applications },
issue_date = { March 2014 },
volume = { 89 },
number = { 10 },
month = { March },
year = { 2014 },
issn = { 0975-8887 },
pages = { 1-7 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume89/number10/15664-3467/ },
doi = { 10.5120/15664-3467 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:08:50.576337+05:30
%A J. Jayabharathy
%A S. Kanmani
%A N. Sivaranjani
%T Correlated Concept based Topic Updation Model for Dynamic Corpora
%J International Journal of Computer Applications
%@ 0975-8887
%V 89
%N 10
%P 1-7
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The rapid growth of documents available on the Internet, in digital libraries, medical records, news wires and other scientific document corpora has motivated researchers to propose many text mining techniques that help users quickly retrieve, trace and summarize information in an effective way. Topic detection is one such technique: it discovers precise, meaningful and concise labels for static document clusters once they have been formed. These labels help the user navigate to and retrieve the needed information quickly and efficiently. Topic updation is the process of identifying and renewing the discovered labels whenever the document clusters are updated dynamically. This paper focuses on a topic updation model based on Testor theory. The proposed work is evaluated on the 20 Newsgroups and scientific literature data sets. The experimental results demonstrate that the proposed algorithm exhibits better performance than the existing algorithms for topic detection.
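
To make the idea of topic updation concrete, the following is a minimal Python sketch of relabeling a cluster as new documents arrive. It is not the Testor-theory model proposed in the paper: as a stated assumption, it substitutes a simple term-frequency ranking for label selection, and the names `tokenize`, `ClusterLabeler` and `add_document` are hypothetical, introduced only for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and drop tokens of two characters or fewer."""
    return [t for t in text.lower().split() if len(t) > 2]


class ClusterLabeler:
    """Keeps running term counts per cluster and derives a label from the top terms.

    Assumption: the label is the top_k most frequent terms. The paper's
    Testor-theory selection is NOT implemented here.
    """

    def __init__(self, top_k=2):
        self.top_k = top_k
        self.counts = {}  # cluster id -> Counter of term frequencies
        self.labels = {}  # cluster id -> tuple of current label terms

    def add_document(self, cluster_id, text):
        """Add a document to a cluster; return True if the cluster label changed."""
        counter = self.counts.setdefault(cluster_id, Counter())
        counter.update(tokenize(text))
        old_label = self.labels.get(cluster_id)
        # Rank by descending frequency, then alphabetically so ties are deterministic.
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        new_label = tuple(term for term, _ in ranked[: self.top_k])
        self.labels[cluster_id] = new_label
        return new_label != old_label
```

Calling `add_document` each time a document joins a cluster reports whether the label needs renewing, which mirrors the static-versus-dynamic distinction in the abstract: labels are only recomputed for clusters that actually change.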

Index Terms

Computer Science
Information Sciences

Keywords

Document Clustering, Static Clustering, Dynamic Clustering, Topic Detection, Topic Updation, Testor Theory, F-Measure, Purity