Research Article

K-Harmonic Means Granular Computing Model for Protein Sequence Motif Identification

Published on February 2013 by M Chitralegha, K Thangavel
International Conference on Communication, Computing and Information Technology
Foundation of Computer Science USA
ICCCMIT - Number 1

M Chitralegha, K Thangavel. K-Harmonic Means Granular Computing Model for Protein Sequence Motif Identification. International Conference on Communication, Computing and Information Technology. ICCCMIT, 1 (February 2013), 17-23.

@article{
author = { M Chitralegha, K Thangavel },
title = { K-Harmonic Means Granular Computing Model for Protein Sequence Motif Identification },
journal = { International Conference on Communication, Computing and Information Technology },
issue_date = { February 2013 },
volume = { ICCCMIT },
number = { 1 },
month = { February },
year = { 2013 },
issn = { 0975-8887 },
pages = { 17-23 },
numpages = 7,
url = { /specialissues/icccmit/number1/10325-1007/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Special Issue Article
%1 International Conference on Communication, Computing and Information Technology
%A M Chitralegha
%A K Thangavel
%T K-Harmonic Means Granular Computing Model for Protein Sequence Motif Identification
%J International Conference on Communication, Computing and Information Technology
%@ 0975-8887
%V ICCCMIT
%N 1
%P 17-23
%D 2013
%I International Journal of Computer Applications
Abstract

Bioinformatics is concerned with the creation and advancement of algorithms that use techniques such as computational intelligence, applied mathematics and statistics to solve biological problems. Sequence analysis, protein structure alignment analysis and prediction, and gene finding are major research efforts in bioinformatics. Proteins are among the most important elements in the process of life. The activities and functions of proteins can be determined from protein sequence motifs, and identifying such motifs is one of the crucial tasks in bioinformatics. In this study, Singular Value Decomposition (SVD) is adopted to select significant sequence segments, and a K-Harmonic Means granular computing model is then proposed to generate protein sequence motif information efficiently. Experimental results show that the K-Harmonic Means granular computing model outperforms the K-Means granular technique.
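
To make the clustering step more concrete, the following is a minimal sketch of the plain K-Harmonic Means center update (the algorithm of Zhang, Hsu and Dayal), written in its commonly used membership-and-weight form. It is not the granular computing model of the paper: the function names `k_harmonic_means` and `assign`, the exponent `p=3.5`, the iteration count, and the toy 2-D points are all assumptions chosen for illustration, and real input would be sequence-segment profiles rather than coordinates. Initial centers are passed in explicitly because the behaviour of the iteration can depend on them.

```python
import math


def k_harmonic_means(points, centers, p=3.5, iters=100, eps=1e-9):
    """Run K-Harmonic Means updates starting from the given centers.

    points  : list of equal-length coordinate tuples
    centers : initial centers (hypothetical choice made by the caller)
    p       : distance exponent (assumed value, not taken from the paper)
    """
    centers = [list(c) for c in centers]
    k, dim = len(centers), len(points[0])
    for _ in range(iters):
        num = [[0.0] * dim for _ in range(k)]  # weighted coordinate sums
        den = [0.0] * k                        # total weight per center
        for x in points:
            # Distances to every center, floored at eps to avoid dividing by zero.
            d = [max(math.dist(x, c), eps) for c in centers]
            a = [dj ** (-p - 2) for dj in d]
            b = sum(dj ** (-p) for dj in d)
            for j in range(k):
                # Soft membership times per-point weight: d_ij^(-p-2) / (sum_l d_il^(-p))^2
                coef = a[j] / (b * b)
                den[j] += coef
                for t in range(dim):
                    num[j][t] += coef * x[t]
        centers = [[num[j][t] / den[j] for t in range(dim)] for j in range(k)]
    return centers


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(x, centers[j]))
            for x in points]


# Toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers = k_harmonic_means(points, [(0.0, 0.0), (11.0, 11.0)])
labels = assign(points, centers)
print(centers, labels)
```

Unlike K-Means, every point contributes to every center through a soft weight, so no hard assignment is made until the final `assign` call.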

Index Terms

Computer Science
Information Sciences

Keywords

Protein Sequence Motif, Clustering, HSSP-BLOSUM62, SVD