Research Article

Biclustering of Gene Expression Data using a Two - Phase Method

by Madhuleena Das, Bhogeswar Borah
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 103 - Number 13
Year of Publication: 2014
Authors: Madhuleena Das, Bhogeswar Borah
10.5120/18132-9232

Madhuleena Das, Bhogeswar Borah. Biclustering of Gene Expression Data using a Two - Phase Method. International Journal of Computer Applications. 103, 13 (October 2014), 6-10. DOI=10.5120/18132-9232

@article{ 10.5120/18132-9232,
author = { Madhuleena Das and Bhogeswar Borah },
title = { Biclustering of Gene Expression Data using a Two - Phase Method },
journal = { International Journal of Computer Applications },
issue_date = { October 2014 },
volume = { 103 },
number = { 13 },
month = { October },
year = { 2014 },
issn = { 0975-8887 },
pages = { 6-10 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume103/number13/18132-9232/ },
doi = { 10.5120/18132-9232 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:34:26.458923+05:30
%A Madhuleena Das
%A Bhogeswar Borah
%T Biclustering of Gene Expression Data using a Two - Phase Method
%J International Journal of Computer Applications
%@ 0975-8887
%V 103
%N 13
%P 6-10
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Biclustering is a data mining technique that identifies coherent patterns in microarray gene expression data. A bicluster of a gene expression dataset is a subset of genes that exhibit similar expression patterns across a subset of conditions. Biclustering is a powerful analytical tool for biologists and has generated considerable interest over the past few decades. Many biclustering algorithms optimize a mean squared residue score to discover biclusters in a gene expression dataset. In this paper, a two-phase method for finding biclusters is developed. In the first phase, a modified version of the k-means algorithm is applied to the gene expression data to generate k clusters. In the second phase, an iterative search checks whether further genes and conditions can be removed within the given threshold value of the mean squared residue score. Experimental results on a yeast dataset show that the approach can effectively find high-quality biclusters.
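
The abstract does not spell out the mean squared residue score or the details of the second-phase search, so the following Python sketch is only an illustration under stated assumptions. It assumes the commonly used definition of the score, in which each entry's residue is the entry minus its row mean, minus its column mean, plus the overall submatrix mean, and the score is the average of the squared residues. The pruning routine is a hypothetical greedy stand-in for the iterative search, not the authors' procedure; the names `mean_squared_residue`, `prune_to_threshold`, and `delta` are chosen here for illustration.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix picked out by rows and cols.

    Assumes the standard definition: the residue of entry (i, j) is
    a_ij - row_mean_i - col_mean_j + overall_mean.
    """
    n_rows, n_cols = len(rows), len(cols)
    row_mean = {i: sum(matrix[i][j] for j in cols) / n_cols for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / n_rows for j in cols}
    overall = sum(row_mean[i] for i in rows) / n_rows
    total = 0.0
    for i in rows:
        for j in cols:
            residue = matrix[i][j] - row_mean[i] - col_mean[j] + overall
            total += residue * residue
    return total / (n_rows * n_cols)


def prune_to_threshold(matrix, rows, cols, delta):
    """Hypothetical greedy pruning (not the paper's algorithm).

    While the score exceeds delta, remove the single gene (row) or
    condition (column) whose removal gives the lowest score. At least
    two rows and two columns are always kept.
    """
    rows, cols = list(rows), list(cols)
    while mean_squared_residue(matrix, rows, cols) > delta:
        candidates = []
        if len(rows) > 2:
            candidates += [([r for r in rows if r != i], cols) for i in rows]
        if len(cols) > 2:
            candidates += [(rows, [c for c in cols if c != j]) for j in cols]
        if not candidates:
            break  # cannot shrink further; threshold not reachable
        rows, cols = min(
            candidates,
            key=lambda rc: mean_squared_residue(matrix, rc[0], rc[1]),
        )
    return rows, cols


# Toy expression matrix: the first three genes follow an additive pattern
# across the conditions, the fourth gene does not.
expression = [
    [1, 2, 3],
    [2, 3, 4],
    [4, 5, 6],
    [9, 0, 5],
]
genes, conditions = prune_to_threshold(expression, [0, 1, 2, 3], [0, 1, 2], delta=0.01)
print(genes, conditions)
```

A score of zero corresponds to a perfectly additive pattern, which is why a small threshold is used to accept biclusters that are coherent up to noise. In a two-phase setting, a routine of this kind would be applied to each cluster produced by the first phase, with the threshold chosen per dataset.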

Index Terms

Computer Science
Information Sciences

Keywords

Gene expression data, data mining, clustering, biclustering