An Effective Supervised Streamed Text Classification Approach for Mining Positive and Negative Examples

Safdar Sardar Khan; Divakar Singh

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 75 - Number 1

Year of Publication: 2013

Authors: Safdar Sardar Khan, Divakar Singh

DOI: 10.5120/13075-7334

Safdar Sardar Khan, Divakar Singh. An Effective Supervised Streamed Text Classification Approach for Mining Positive and Negative Examples. International Journal of Computer Applications. 75, 1 (August 2013), 24-29. DOI=10.5120/13075-7334

@article{10.5120/13075-7334,
  author = { Safdar Sardar Khan and Divakar Singh },
  title = { An Effective Supervised Streamed Text Classification Approach for Mining Positive and Negative Examples },
  journal = { International Journal of Computer Applications },
  issue_date = { August 2013 },
  volume = { 75 },
  number = { 1 },
  month = { August },
  year = { 2013 },
  issn = { 0975-8887 },
  pages = { 24-29 },
  numpages = { 6 },
  url = { https://ijcaonline.org/archives/volume75/number1/13075-7334/ },
  doi = { 10.5120/13075-7334 },
  publisher = { Foundation of Computer Science (FCS), NY, USA },
  address = { New York, USA }
}

%0 Journal Article

%1 2024-02-06T21:43:07.472606+05:30

%A Safdar Sardar Khan

%A Divakar Singh

%T An Effective Supervised Streamed Text Classification Approach for Mining Positive and Negative Examples

%J International Journal of Computer Applications

%@ 0975-8887

%V 75

%N 1

%P 24-29

%D 2013

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the field of text mining. This survey paper covers the effective classification of streamed data for text mining using two approaches: the PNLH labelling heuristic and one-class classification with SVMs for text content auditing. First, we consider the problem of one-class classification of text streams under concept drift, where a large volume of documents arrives at high speed while user interests and the data distribution change. In this setting, only a small number of positively labelled documents are available for training. Second, we revisit text classification without negative examples and describe a labelling heuristic called PNLH that tackles this problem. PNLH aims to extract high-quality positive and negative examples from the unlabelled data U, and it can be used on top of any existing classifier.
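To make the idea of extracting positive and negative examples from U concrete, here is a minimal sketch in Python. It is not the PNLH algorithm itself: it swaps in a simple centroid-similarity rule, and the function names and the `low` and `high` thresholds are assumptions chosen only for illustration. Documents in U that look very similar to the known positives are kept as likely positives, documents that share almost nothing with them are kept as reliable negatives, and the rest stay undecided.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a document into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Sum term counts; only the direction matters for cosine similarity."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def label_unlabelled(positives, unlabelled, low=0.1, high=0.5):
    """Split U into likely positives, reliable negatives, and undecided docs.

    `low` and `high` are hypothetical thresholds, not values from the paper.
    """
    pos_centroid = centroid(vectorize(d) for d in positives)
    likely_pos, reliable_neg, undecided = [], [], []
    for doc in unlabelled:
        sim = cosine(vectorize(doc), pos_centroid)
        if sim >= high:
            likely_pos.append(doc)
        elif sim <= low:
            reliable_neg.append(doc)
        else:
            undecided.append(doc)
    return likely_pos, reliable_neg, undecided


# Toy data: P is the small labelled positive set, U is the unlabelled set.
P = ["stock market shares rise", "market shares fall on stock news"]
U = [
    "stock market shares rally",
    "football team wins cup final",
    "market news today",
]
likely_pos, reliable_neg, undecided = label_unlabelled(P, U)
print(likely_pos, reliable_neg, undecided)
```

The extracted positives and negatives could then be passed to any ordinary two-class classifier, which is the sense in which such a labelling step sits on top of an existing classifier. In a streaming setting one would presumably rerun the labelling on each new batch so that the positive centroid can follow drifting user interests, but that extension is an assumption and is not shown here.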

References

D. R. Cutting, D. R. Karger, J. O. Pedersen, and J. W. Tukey, "Scatter/Gather: A Cluster-Based Approach to Browsing Large Document Collections," Proc. 15th Int'l Conf. Research and Development in Information Retrieval, 1992.
H. Schütze, D. A. Hull, and J. O. Pedersen, "A Comparison of Classifiers and Document Representations for the Routing Problem," Proc. 18th Int'l Conf. Research and Development in Information Retrieval, 1995.
K. Bennett and A. Demiriz, "Semi-Supervised Support Vector Machines," Advances in Neural Information Processing Systems, vol. 11, 1998.
P. Bradley and U. Fayyad, "Refining Initial Points for k-Means Clustering," Proc. 15th Int'l Conf. Machine Learning, 1998.
T. Joachims, "Text Categorization with Support Vector Machines: Learning with Many Relevant Features," Proc. 10th European Conf. Machine Learning, 1998.
R. Klinkenberg and I. Renz, "Adaptive information filtering: learning in the presence of concept drifts," Workshop Notes of the ICML-98 Workshop on Learning for Text Categorization, pages 33–40, 1998.
B. Larsen and C. Aone, "Fast and Effective Text Mining Using Linear-Time Document Clustering," Proc. Fifth Int'l Conf. Knowledge Discovery and Data Mining, 1999.
T. Zhang, "The Value of Unlabeled Data for Classification Problems," Proc. 17th Int'l Conf. Machine Learning, 2000.
K. Nigam, A. McCallum, S. Thrun, and T. Mitchell, "Text Classification from Labeled and Unlabeled Documents Using EM," Machine Learning, vol. 39, 2000.
R. Klinkenberg and T. Joachims, "Detecting concept drift with support vector machines," In Proceedings of the Seventeenth International Conference on Machine Learning (ICML'00), pages 487–494, 2000.
T. Dietterich, "Ensemble methods in machine learning," Proceedings of the First International Workshop on Multiple Classifier Systems, pages 1–15, 2000.
W. Street and Y. Kim, "A streaming ensemble algorithm (SEA) for large-scale classification," Proceedings of the seventh international conference on Knowledge discovery and data mining, (KDD'01), pages 377–382, 2001.
D. Tax, "One-class classification," Doctoral dissertation, Delft University of Technology, 2001.
Y. Yang, "A Study on Thresholding Strategies for Text Categorization," Proc. 24th Int'l Conf. Research and Development in Information Retrieval, 2001.
J. Allan, "Topic Detection and Tracking: Event-Based Information Organization," Kluwer Academic Publishers, 2002.
F. Sebastiani, "Machine learning in automated text categorization," ACM Computing Surveys, vol. 34, no. 1, pages 1–47, 2002.
J. Bockhorst and M. Craven, "Exploiting Relations Among Concepts to Acquire Weakly Labeled Training Data," Proc. 19th Int'l Conf. Machine Learning, 2002.
R. Ghani, "Combining Labeled and Unlabeled Data for Multiclass Text Categorization," Proc. 19th Int'l Conf. Machine Learning, 2002.
J. Kolter and M. Maloof, "Dynamic weighted majority: a new ensemble method for tracking concept drift," Third International Conference on Data Mining, (ICDM'03), pages 123–130, 2003.
B. Liu, Y. Dai, X. Li, W. S. Lee, and P. S. Yu, "Building Text Classifiers Using Positive and Unlabeled Examples," Proceedings of the Third IEEE International Conference on Data Mining, (ICDM'03), pages 179–186, 2003.
Page Classification Using SVM," Proc. Ninth Int'l Conf. Knowledge Discovery and Data Mining, 2003.
R. Klinkenberg, "Learning drifting concepts: example selection vs. example weighting," Intelligent Data Analysis, pages 281–300, 2004.
B. Liu, X. Li, W. S. Lee, and P. S. Yu, "Text Classification by Labeling Words," Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-2004), pages 425–430, 2004.
X. Zhu, X. Wu, and Y. Yang, "Dynamic classifier selection for effective mining from noisy data streams," Proceedings of the 4th international conference on Data Mining, (ICDM'04), pages 305–312, 2004.
Symposium on Computer-Based Medical Systems, (CBMS'06), pages 679–684, 2006.
S. Wu, C. Yang, and J. Zhou, "Clustering-training for data stream mining," Sixth IEEE International Conference of Data Mining Workshops, pages 653–656, 2006.
Y. Zhang and X. Jin, "An automatic construction and organization strategy for ensemble learning on data streams," ACM SIGMOD Record, vol. 3, pages 28–33, 2006.
S. Huang and Y. Dong, "An active learning system for mining time-changing data streams," Intelligent Data Analysis, vol. 4, pages 401–419, 2007.
X. Zhu, P. Zhang, X. Lin, and Y. Shi, "Active Learning from Data Streams," Proceedings of the Sixth International Conference on Data Mining, (ICDM'06), 2007.
X. Jeffrey member, "Text classification without negative examples revisit," IEEE Computer Society, 2008.
Z. Zhang Yang, "One-class classification of text streams with concept drift," University of Queensland Australia, 2008.
Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques," third edition, 2010.
Arun K Pujari, "Data mining & techniques," second edition, Universities Press, 2011.
Ning Zhong, Yuefeng Li, "Effective Pattern Discovery for Text Mining," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 1, January 2012.

Index Terms

Computer Science

Information Sciences

Keywords

Text mining, text categorization, partially supervised learning, labelling unlabelled data, pattern mining, information filtering