An Optimum Model for the Retrieval of Missing Values for Data Cleansing using Regression Analysis

Deepshikha Aggarwal; V. B. Aggarwal

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 117 - Number 2

Year of Publication: 2015

10.5120/20529-2869

Deepshikha Aggarwal, V. B. Aggarwal. An Optimum Model for the Retrieval of Missing Values for Data Cleansing using Regression Analysis. International Journal of Computer Applications. 117, 2 (May 2015), 35-39. DOI=10.5120/20529-2869

@article{ 10.5120/20529-2869,
author = { Deepshikha Aggarwal, V. B. Aggarwal },
title = { An Optimum Model for the Retrieval of Missing Values for Data Cleansing using Regression Analysis },
journal = { International Journal of Computer Applications },
issue_date = { May 2015 },
volume = { 117 },
number = { 2 },
month = { May },
year = { 2015 },
issn = { 0975-8887 },
pages = { 35-39 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume117/number2/20529-2869/ },
doi = { 10.5120/20529-2869 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

Abstract

An important aspect of data mining is the pre-processing of data. Pre-processing matters because real-world data is susceptible to inconsistencies, noise, and missing values. Such data cannot be used directly in data mining, as it would produce highly inadequate results. There are basically two ways to handle missing values: the first is to ignore the data containing the missing value, and the second is to predict the missing value. A prediction can be made by assuming continuity of the data or by assigning a suitable value based on prior knowledge. In this paper, our focus is on providing an adequate method for filling in missing values: we predict a suitable value by comparing the various known regression methods and choosing one based on both statistical analysis and subjective analysis of the graph.
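
The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`polyfit`, `fit_models`, `rmse`, `impute`) are hypothetical, the candidate families (linear, quadratic, exponential, Gaussian) and the RMSE criterion are taken from the keywords, and the exponential and Gaussian curves are fitted on log-transformed values as a simplifying assumption, which requires all known values to be positive.

```python
import math

def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.
    Returns coefficients c so that y ~ c[0] + c[1]*x + c[2]*x**2 + ..."""
    n = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs

def poly_eval(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

def fit_models(xs, ys):
    """Fit each candidate family and return {name: prediction function}."""
    lin = polyfit(xs, ys, 1)
    quad = polyfit(xs, ys, 2)
    models = {
        "linear": lambda x: poly_eval(lin, x),
        "quadratic": lambda x: poly_eval(quad, x),
    }
    # Assumption: exponential and Gaussian curves are fitted on log(y),
    # which is only defined when every known value is positive.
    if all(y > 0 for y in ys):
        log_ys = [math.log(y) for y in ys]
        exp_c = polyfit(xs, log_ys, 1)   # log y linear in x
        gau_c = polyfit(xs, log_ys, 2)   # log y quadratic in x
        models["exponential"] = lambda x: math.exp(poly_eval(exp_c, x))
        models["gaussian"] = lambda x: math.exp(poly_eval(gau_c, x))
    return models

def rmse(predict, xs, ys):
    """Root mean square error of a prediction function on known points."""
    return math.sqrt(sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def impute(xs, ys):
    """ys may contain None for missing values.
    Returns (name of the lowest-RMSE model, ys with the gaps filled)."""
    known = [(x, y) for x, y in zip(xs, ys) if y is not None]
    kx, ky = zip(*known)
    models = fit_models(kx, ky)
    best = min(models, key=lambda name: rmse(models[name], kx, ky))
    filled = [y if y is not None else models[best](x) for x, y in zip(xs, ys)]
    return best, filled
```

Calling `impute([0, 1, 2, 3, 4, 5], [1, 2, None, 8, 16, 32])` fits all four candidates to the five known points and fills the gap with the prediction of the lowest-RMSE model. Note that RMSE measured on the same points used for fitting tends to favour the more flexible curves, which is one reason a subjective look at the plotted fit, as the abstract mentions, remains useful alongside the statistic.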

Index Terms

Computer Science

Information Sciences

Keywords

Data Quality, Missing Values, Data Cleaning, Regression, Linear, Quadratic, Exponential, Gaussian, Prediction, RMSE