An Efficient Preprocessing Methodology of Log File for Web Usage Mining

IJCA Proceedings on National Conference on Research Issues in Image Analysis and Mining Intelligence
© 2015 by IJCA Journal
NCRIIAMI 2015 - Number 2
Year of Publication: 2015
Authors:
A. Deepa
P. Raajan

A. Deepa and P. Raajan. Article: An Efficient Preprocessing Methodology of Log File for Web Usage Mining. IJCA Proceedings on National Conference on Research Issues in Image Analysis and Mining Intelligence NCRIIAMI 2015(2):15-16, June 2015. Full text available.

BibTeX

@article{key:article,
	author = {A. Deepa and P. Raajan},
	title = {Article: An Efficient Preprocessing Methodology of Log File for Web Usage Mining},
	journal = {IJCA Proceedings on National Conference on Research Issues in Image Analysis and Mining Intelligence},
	year = {2015},
	volume = {NCRIIAMI 2015},
	number = {2},
	pages = {15-16},
	month = {June},
	note = {Full text available}
}

Abstract

Nowadays, the WWW has become an important and huge data store. All user activity is recorded in log files, which reflect users' interest in a particular website. With the wide usage of the internet, log file sizes are growing rapidly. Web mining is the process of extracting information from web data. A raw log file does not reveal users' access patterns, so preprocessing has become an important step in web mining. Web Usage Mining is an important area of web mining that extracts and analyzes users' usage patterns from the server log file. The quality of the input determines the quality of the output, so preprocessing is a noteworthy step before mining interesting information from data. In this paper we implement preprocessing techniques that convert the log file into user sessions suitable for mining, and that reduce the size of the session file by filtering out the least requested pages.
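
The abstract does not spell out the individual preprocessing steps, so the following is only a minimal sketch of one common way such a pipeline could look, not the authors' implementation. It assumes the server writes Common Log Format, identifies a user by IP address alone, closes a session after 30 minutes of inactivity, and treats a page as "least requested" when it appears fewer than `min_requests` times. The suffix list, the timeout, the threshold, and all function names are hypothetical choices made for illustration.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Common Log Format: host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Assumed values, not taken from the paper.
IGNORED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Turn one log line into a record dict, or None if it is malformed."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "time": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "method": match.group("method"),
        "url": match.group("url"),
        "status": int(match.group("status")),
    }


def clean(records):
    """Keep successful GET requests for pages; drop images, styles, scripts."""
    return [
        r for r in records
        if r["method"] == "GET"
        and 200 <= r["status"] < 300
        and not r["url"].lower().endswith(IGNORED_SUFFIXES)
    ]


def sessionize(records):
    """Group requests per host, starting a new session after a long gap."""
    sessions = []
    latest = {}  # host -> (time of last request, that host's current session)
    for r in sorted(records, key=lambda rec: rec["time"]):
        previous = latest.get(r["host"])
        if previous is None or r["time"] - previous[0] > SESSION_TIMEOUT:
            session = []
            sessions.append(session)
        else:
            session = previous[1]
        session.append(r["url"])
        latest[r["host"]] = (r["time"], session)
    return sessions


def drop_rare_pages(sessions, min_requests=2):
    """Remove pages requested fewer than min_requests times overall."""
    counts = Counter(url for session in sessions for url in session)
    filtered = [[u for u in s if counts[u] >= min_requests] for s in sessions]
    return [s for s in filtered if s]  # discard sessions left empty


def preprocess(lines, min_requests=2):
    """Run parsing, cleaning, session identification, and page filtering."""
    records = [r for r in map(parse_line, lines) if r is not None]
    return drop_rare_pages(sessionize(clean(records)), min_requests)


# Made-up sample data for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Jun/2015:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.1 - - [10/Jun/2015:10:00:01 +0000] "GET /logo.gif HTTP/1.0" 200 512',
    '10.0.0.1 - - [10/Jun/2015:10:05:00 +0000] "GET /products.html HTTP/1.0" 200 2048',
    '10.0.0.2 - - [10/Jun/2015:10:06:00 +0000] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.2 - - [10/Jun/2015:10:07:00 +0000] "GET /missing.html HTTP/1.0" 404 0',
    '10.0.0.1 - - [10/Jun/2015:11:30:00 +0000] "GET /index.html HTTP/1.0" 200 1043',
]
```

Calling `preprocess(SAMPLE_LOG)` on the made-up sample yields three single-page sessions: the image request and the 404 are removed during cleaning, the 85-minute gap splits the first host's visits into two sessions, and `/products.html` is filtered out because it was requested only once. Identifying users by IP address alone is a simplification; proxies and shared addresses would call for additional fields such as the user agent, which Common Log Format does not record.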
