DOI: 10.5120/2419-3237
R. Devi Priya, S. Kuppuswami and S. Makesh Kumar. Article: A Genetic Algorithm Approach for Non-Ignorable Missing Data. International Journal of Computer Applications 20(4):37-41, April 2011. Full text available. BibTeX
@article{key:article, author = {R.Devi Priya and S.Kuppuswami and S.Makesh Kumar}, title = {Article: A Genetic Algorithm Approach for Non-Ignorable Missing Data}, journal = {International Journal of Computer Applications}, year = {2011}, volume = {20}, number = {4}, pages = {37-41}, month = {April}, note = {Full text available} }
Abstract
Databases store data that may acquire missing values during either data acquisition or data storage. The proposed approach applies the genetic algorithm, a widely used optimization technique, to data missing under the NMAR (Not Missing At Random) mechanism, a non-ignorable mechanism that is common in real-world data. Because handling a non-ignorable mechanism requires prior knowledge of, and assumptions about, the data that are missing, the genetic algorithm (GA) suits this problem well: it derives a solution from the previously observed data. The empirical results show that the genetic algorithm is more efficient than some of the traditional methods.