Research Article

Automatic Naming of Domain Specific Clusters for Efficient Searching

by Anshika Nagpal, Mukesh Rawat
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 113 - Number 7
Year of Publication: 2015
Authors: Anshika Nagpal, Mukesh Rawat
DOI: 10.5120/19842-1699

Anshika Nagpal, Mukesh Rawat. Automatic Naming of Domain Specific Clusters for Efficient Searching. International Journal of Computer Applications. 113, 7 (March 2015), 46-48. DOI=10.5120/19842-1699

@article{ 10.5120/19842-1699,
author = { Anshika Nagpal and Mukesh Rawat },
title = { Automatic Naming of Domain Specific Clusters for Efficient Searching },
journal = { International Journal of Computer Applications },
issue_date = { March 2015 },
volume = { 113 },
number = { 7 },
month = { March },
year = { 2015 },
issn = { 0975-8887 },
pages = { 46-48 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume113/number7/19842-1699/ },
doi = { 10.5120/19842-1699 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:50:22.620955+05:30
%A Anshika Nagpal
%A Mukesh Rawat
%T Automatic Naming of Domain Specific Clusters for Efficient Searching
%J International Journal of Computer Applications
%@ 0975-8887
%V 113
%N 7
%P 46-48
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper proposes a new and efficient methodology for clustering HTML documents. Categorizing documents by topic into separate clusters makes searching easier and more efficient. Search engines can use the technique to return results relevant to a user's query, and online journal sites that maintain large document collections can use it as well. The paper proposes a word-matching scheme and automatic naming of the generated clusters, which reduces the time needed to find the appropriate cluster for a document. It also presents an efficient technique for measuring the similarity between documents and assigning each one to a suitable cluster. The resulting clusters can in turn feed a multi-document summarization system, which produces a summary for each set of related documents.
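The abstract does not spell out the similarity measure or the naming rule, so the following is only a minimal sketch of the general idea under stated assumptions: documents are compared by cosine similarity over raw term counts, a document joins the most similar existing cluster (or starts a new one), and a cluster is named after its most frequent terms. The stop-word list, the threshold value, and every function name here are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on", "with"}


def term_counts(text):
    """Strip HTML tags, lower-case the text, and count non-stop-word terms."""
    text = re.sub(r"<[^>]+>", " ", text)
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(members):
    """Sum the term counts of all documents in a cluster."""
    total = Counter()
    for vec in members:
        total.update(vec)
    return total


def cluster_name(members, n_terms=2):
    """Name a cluster after its most frequent terms (assumed naming rule)."""
    return " ".join(term for term, _ in centroid(members).most_common(n_terms))


def assign(doc, clusters, threshold=0.2):
    """Add doc to the most similar cluster, or start a new cluster
    when no existing cluster reaches the (assumed) similarity threshold."""
    vec = term_counts(doc)
    best, best_sim = None, threshold
    for members in clusters:
        sim = cosine(vec, centroid(members))
        if sim >= best_sim:
            best, best_sim = members, sim
    if best is None:
        clusters.append([vec])
    else:
        best.append(vec)
    return clusters
```

Calling `assign` once per HTML document builds the cluster list incrementally, and `cluster_name` can then label each cluster so that a later lookup compares a new document against cluster names or centroids rather than against every stored document.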

Index Terms

Computer Science
Information Sciences

Keywords

Clustering, similarity, clusters