Research Article

Enhancing Performance of Web Page by Removing Noises using LRU

by Anchal Garg, Bikrampal Kaur
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 103 - Number 6
Year of Publication: 2014
Authors: Anchal Garg, Bikrampal Kaur
10.5120/18079-8632

Anchal Garg, Bikrampal Kaur. Enhancing Performance of Web Page by Removing Noises using LRU. International Journal of Computer Applications. 103, 6 (October 2014), 23-27. DOI=10.5120/18079-8632

@article{ 10.5120/18079-8632,
author = { Anchal Garg and Bikrampal Kaur },
title = { Enhancing Performance of Web Page by Removing Noises using LRU },
journal = { International Journal of Computer Applications },
issue_date = { October 2014 },
volume = { 103 },
number = { 6 },
month = { October },
year = { 2014 },
issn = { 0975-8887 },
pages = { 23-27 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume103/number6/18079-8632/ },
doi = { 10.5120/18079-8632 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:33:49.635805+05:30
%A Anchal Garg
%A Bikrampal Kaur
%T Enhancing Performance of Web Page by Removing Noises using LRU
%J International Journal of Computer Applications
%@ 0975-8887
%V 103
%N 6
%P 23-27
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data mining is the process of extracting information from large data sets. Web mining is an important application of data mining that extracts knowledge from Web data, including Web documents, hyperlinks, and usage logs of websites. A Web page contains many blocks, such as content blocks, copyright notices, privacy notes, and advertisements. Blocks such as advertisements and copyright notices are not part of the main content; they are known as noisy blocks because they contain noisy information. This noisy information adversely affects Web data mining, and eliminating it improves Web data mining. This paper discusses how to identify these noises and how to eliminate them to improve the efficiency of Web mining. Several algorithms are used in Web mining, e.g., the Visitor method and the DOM tree. Both are complex and time-consuming. The paper also discusses removing noises with the simple LRU algorithm and its variants, which results in a less time-consuming algorithm for Web mining.
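
For readers unfamiliar with LRU (least recently used), the following is a minimal Python sketch of the generic replacement policy only. It is not the authors' noise-removal procedure, which the abstract does not specify; the class name `LRUCache`, its methods, and the idea of keying entries by a page-block identifier are assumptions made purely for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Generic least-recently-used cache (illustrative sketch, not the paper's code)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # Keys are kept in order from least to most recently used.
        self._items = OrderedDict()

    def get(self, key, default=None):
        """Return the value for key and mark it as most recently used."""
        if key not in self._items:
            return default
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        """Insert or update key, evicting the least recently used entry if full."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # popitem(last=False) removes the oldest (least recently used) entry.
            self._items.popitem(last=False)

    def keys(self):
        """Return keys ordered from least to most recently used."""
        return list(self._items)
```

With a capacity of 2, inserting hypothetical block identifiers "a" and "b", reading "a", and then inserting "c" evicts "b", because "b" is the entry that was used least recently. How such a policy is mapped onto noisy and content blocks is described in the paper itself, not here.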

Index Terms

Computer Science
Information Sciences

Keywords

Content Extraction, DOM Tree, LRU, Web Mining