Research Article

Automatic Document Collection

by Shashikant, Mukesh Rawat
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 70 - Number 25
Year of Publication: 2013
Authors: Shashikant, Mukesh Rawat
10.5120/12221-8137

Shashikant, Mukesh Rawat. Automatic Document Collection. International Journal of Computer Applications. 70, 25 (May 2013), 9-12. DOI=10.5120/12221-8137

@article{ 10.5120/12221-8137,
author = { Shashikant, Mukesh Rawat },
title = { Automatic Document Collection },
journal = { International Journal of Computer Applications },
issue_date = { May 2013 },
volume = { 70 },
number = { 25 },
month = { May },
year = { 2013 },
issn = { 0975-8887 },
pages = { 9-12 },
numpages = {4},
url = { https://ijcaonline.org/archives/volume70/number25/12221-8137/ },
doi = { 10.5120/12221-8137 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:33:47.331441+05:30
%A Shashikant
%A Mukesh Rawat
%T Automatic Document Collection
%J International Journal of Computer Applications
%@ 0975-8887
%V 70
%N 25
%P 9-12
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Document classification is now an important research area, because large volumes of electronic documents are available as unstructured, semi-structured, and structured information. It applies to the World Wide Web, electronic book sites, online forums, electronic mail, blogs, digital libraries, and online government repositories. Organizing this information therefore requires proper categorization and knowledge discovery. This paper reviews the existing literature and examines techniques for automatic document classification, i.e., document representation, knowledge extraction, and classification. The authors also propose an algorithm and an architecture for automatic document collection.

Index Terms

Computer Science
Information Sciences

Keywords

Text mining
Web mining
Automatic document classification