Research Article

A Genetic Algorithm Approach for Non-Ignorable Missing Data

by R.Devi Priya, S.Kuppuswami, S.Makesh Kumar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 20 - Number 4
Year of Publication: 2011
Authors: R.Devi Priya, S.Kuppuswami, S.Makesh Kumar
DOI: 10.5120/2419-3237

R.Devi Priya, S.Kuppuswami, S.Makesh Kumar. A Genetic Algorithm Approach for Non-Ignorable Missing Data. International Journal of Computer Applications. 20, 4 (April 2011), 37-41. DOI=10.5120/2419-3237

@article{ 10.5120/2419-3237,
author = { R.Devi Priya and S.Kuppuswami and S.Makesh Kumar },
title = { A Genetic Algorithm Approach for Non-Ignorable Missing Data },
journal = { International Journal of Computer Applications },
issue_date = { April 2011 },
volume = { 20 },
number = { 4 },
month = { April },
year = { 2011 },
issn = { 0975-8887 },
pages = { 37-41 },
numpages = {5},
url = { https://ijcaonline.org/archives/volume20/number4/2419-3237/ },
doi = { 10.5120/2419-3237 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:06:56.023291+05:30
%A R.Devi Priya
%A S.Kuppuswami
%A S.Makesh Kumar
%T A Genetic Algorithm Approach for Non-Ignorable Missing Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 20
%N 4
%P 37-41
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Databases often contain missing values introduced during data acquisition or data storage. The proposed approach applies the genetic algorithm (GA), a widely used optimization technique, to data that are Not Missing At Random (NMAR), a non-ignorable missingness mechanism that is common in real-life data. Because handling a non-ignorable mechanism requires prior knowledge of, and assumptions about, the data that are missing, a GA suits the problem well: it derives its solution from the previously observed data. The empirical results show that the genetic algorithm is more efficient than some of the traditional methods.
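
The abstract does not state the chromosome encoding, the fitness function, or the genetic operators, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes one fully observed covariate `xs` and one partially missing column `ys`. Each chromosome holds candidate values for the missing cells, and fitness rewards closeness to a regression line fitted on the observed rows plus a hypothetical, analyst-chosen shift `delta` that encodes the NMAR assumption. The names `fit_line`, `ga_impute`, and `delta`, and all parameter defaults, are assumptions made for this example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x on the observed rows."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def ga_impute(xs, ys, delta=0.0, pop_size=40, generations=200,
              mutation_rate=0.2, mutation_sd=0.5, seed=0):
    """Fill the None entries of ys using a toy genetic algorithm.

    delta is a hypothetical shift (an assumption of this sketch) that
    says how far missing values are believed to sit from the trend in
    the observed data; delta=0 reduces to an ignorable-style target.
    """
    rng = random.Random(seed)
    obs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    miss_x = [x for x, y in zip(xs, ys) if y is None]
    a, b = fit_line([x for x, _ in obs], [y for _, y in obs])
    lo = min(y for _, y in obs)
    hi = max(y for _, y in obs)

    def fitness(chrom):
        # Higher is better: negative squared distance from the assumed target.
        return -sum((g - (a + b * x + delta)) ** 2
                    for g, x in zip(chrom, miss_x))

    # Initial population: random values drawn from the observed range.
    pop = [[rng.uniform(lo, hi) for _ in miss_x] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # truncation selection, elite survives
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            # Gaussian mutation applied gene by gene.
            child = [g + rng.gauss(0, mutation_sd)
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = elite + children

    best = iter(max(pop, key=fitness))
    return [y if y is not None else next(best) for y in ys]


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6]
    ys = [2.0, 4.1, None, 8.0, None, 12.1]
    print(ga_impute(xs, ys, delta=0.0))
```

Observed values are returned unchanged; only the `None` cells are replaced. In a real NMAR analysis the shift `delta` cannot be estimated from the observed data alone, so it would normally be varied over a plausible range as a sensitivity analysis.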

Index Terms

Computer Science
Information Sciences

Keywords

Data acquisition, Missing data, NMAR, Non-response, Genetic algorithm, Optimization