Research Article

An Efficient Incremental Clustering based Summarization Technique for Web Page Classification

by Setu Kumar Chaturvedi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 80 - Number 14
Year of Publication: 2013
Authors: Setu Kumar Chaturvedi
DOI: 10.5120/13926-1416

Setu Kumar Chaturvedi. An Efficient Incremental Clustering based Summarization Technique for Web Page Classification. International Journal of Computer Applications. 80, 14 (October 2013), 1-8. DOI=10.5120/13926-1416

@article{ 10.5120/13926-1416,
author = { Setu Kumar Chaturvedi },
title = { An Efficient Incremental Clustering based Summarization Technique for Web Page Classification },
journal = { International Journal of Computer Applications },
issue_date = { October 2013 },
volume = { 80 },
number = { 14 },
month = { October },
year = { 2013 },
issn = { 0975-8887 },
pages = { 1-8 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume80/number14/13926-1416/ },
doi = { 10.5120/13926-1416 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:54:30.796187+05:30
%A Setu Kumar Chaturvedi
%T An Efficient Incremental Clustering based Summarization Technique for Web Page Classification
%J International Journal of Computer Applications
%@ 0975-8887
%V 80
%N 14
%P 1-8
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The World Wide Web is currently the largest source of information, and numerous automatic classification approaches have been proposed for it. This work proposes an efficient incremental clustering based summarization method for web page classification, which can help to better organize the data available on the WWW. The incremental clustering based summarization technique permits dynamic tracking of the ever-growing volume of information put on the World Wide Web every day, which makes it useful for dynamic content. Clustering of web documents is a key method for extracting more contextual and noise-free knowledge for web use. C4.5 is one of the most classic classification algorithms in data mining, but its efficiency is very low when it is used for large-scale computation. In this paper, the C4.5 rule is improved by the use of L'Hospital's rule, which simplifies the calculation process and improves the efficiency of the decision-making algorithm. The aim of this work is to apply the algorithms in a time- and space-efficient manner, with throughput and response time of the application taken as the performance measures. The algorithms are implemented, and their complexities and efficiencies are compared graphically.
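For orientation, the split criterion that standard C4.5 evaluates at each node is the information gain ratio (the "rate of information gain" listed in the keywords). The following is a minimal Python sketch of that textbook criterion for a categorical attribute. It is not the L'Hospital-based simplification proposed in the paper, whose details the abstract does not give. The function names and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute (textbook C4.5 criterion).

    values: attribute value of each example
    labels: class label of each example, same length as values
    """
    total = len(labels)
    # Group the class labels by attribute value.
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Information gain: entropy before the split minus the
    # size-weighted entropy of the partitions after the split.
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: does a page contain the word "price"?
has_price = ["yes", "yes", "no", "no"]
page_class = ["shop", "shop", "news", "news"]
print(gain_ratio(has_price, page_class))  # 1.0
```

Each call evaluates several logarithms per attribute. This repeated logarithm evaluation is presumably the calculation cost that the paper's simplification targets.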

Index Terms

Computer Science
Information Sciences

Keywords

Clustering, Summarization, Classification, Decision Tree, C4.5, L'Hospital Rule, Information Gain Ratio