Development of Cluster based Supervised Learning Technique for Web News Extraction

Pardeep Kaur; Rekha Bhatia

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 152 - Number 5

Year of Publication: 2016

Authors: Pardeep Kaur, Rekha Bhatia

10.5120/ijca2016911805

Pardeep Kaur, Rekha Bhatia. Development of Cluster based Supervised Learning Technique for Web News Extraction. International Journal of Computer Applications. 152, 5 (Oct 2016), 30-31. DOI=10.5120/ijca2016911805

@article{ 10.5120/ijca2016911805,

author = { Pardeep Kaur and Rekha Bhatia },

title = { Development of Cluster based Supervised Learning Technique for Web News Extraction },

journal = { International Journal of Computer Applications },

issue_date = { Oct 2016 },

volume = { 152 },

number = { 5 },

month = { Oct },

year = { 2016 },

issn = { 0975-8887 },

pages = { 30-31 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume152/number5/26317-2016911805/ },

doi = { 10.5120/ijca2016911805 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-06T23:57:23.502448+05:30

%A Pardeep Kaur

%A Rekha Bhatia

%T Development of Cluster based Supervised Learning Technique for Web News Extraction

%J International Journal of Computer Applications

%@ 0975-8887

%V 152

%N 5

%P 30-31

%D 2016

%I Foundation of Computer Science (FCS), NY, USA

Abstract

The World Wide Web is a prominent source of online information: an abundance of data is already available on it, and much more is uploaded daily. With so much information on the web, getting any information at any time appears simple and effortless, but finding the relevant content in practice requires considerable focus. Numerous web mining techniques, such as extractors and wrappers, have been studied as methods for extracting useful web content. This paper proposes a semi-supervised web news extraction technique that combines an unsupervised clustering technique with a supervised classification technique.
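
The abstract does not say which clustering or classification algorithms the paper uses, so the following is only a minimal sketch of the general cluster-then-classify idea, not the authors' method. It assumes plain k-means for the unsupervised step and a majority vote over a few labeled examples for the supervised step. The two block features (scaled text length and link density), the sample values, and all function names are hypothetical.

```python
import math
from collections import Counter


def nearest(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iters=20):
    """Plain k-means; the first k points serve as deterministic initial centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def label_clusters(labeled, centroids):
    """Assign each cluster the majority label of the labeled points that fall in it."""
    votes = {}
    for point, label in labeled:
        votes.setdefault(nearest(point, centroids), Counter())[label] += 1
    return {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}


def classify(point, centroids, cluster_labels):
    """Predict the label of a new block from the cluster it falls into."""
    return cluster_labels.get(nearest(point, centroids), "unknown")


# Hypothetical features per page block: (text length / 1000, link density).
unlabeled = [(0.90, 0.05), (0.05, 0.80), (0.85, 0.10),
             (0.10, 0.90), (0.95, 0.02), (0.08, 0.75)]
# A small hand-labeled set, used only to name the clusters.
labeled = [((0.80, 0.08), "news"), ((0.07, 0.85), "noise")]

centroids = kmeans(unlabeled, k=2)
cluster_labels = label_clusters(labeled, centroids)
prediction = classify((0.88, 0.04), centroids, cluster_labels)
print(prediction)
```

In this sketch the unlabeled blocks shape the clusters and the labeled examples are used only to name them, which is one common way to combine unsupervised and supervised steps when labeled data is scarce.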

Index Terms

Computer Science

Information Sciences

Keywords

Web Mining, Web News, Web News Extraction, Unsupervised Machine Learning, Classification