Research Article

Cluster Analysis Technique based on Bipartite Graph for Human Protein Class Prediction

by Manpreet Singh, Gurvinder Singh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 20 - Number 3
Year of Publication: 2011
Authors: Manpreet Singh, Gurvinder Singh
10.5120/2414-3226

Manpreet Singh, Gurvinder Singh. Cluster Analysis Technique based on Bipartite Graph for Human Protein Class Prediction. International Journal of Computer Applications. 20, 3 (April 2011), 22-27. DOI=10.5120/2414-3226

@article{ 10.5120/2414-3226,
author = { Manpreet Singh and Gurvinder Singh },
title = { Cluster Analysis Technique based on Bipartite Graph for Human Protein Class Prediction },
journal = { International Journal of Computer Applications },
issue_date = { April 2011 },
volume = { 20 },
number = { 3 },
month = { April },
year = { 2011 },
issn = { 0975-8887 },
pages = { 22-27 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume20/number3/2414-3226/ },
doi = { 10.5120/2414-3226 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:06:49.358686+05:30
%A Manpreet Singh
%A Gurvinder Singh
%T Cluster Analysis Technique based on Bipartite Graph for Human Protein Class Prediction
%J International Journal of Computer Applications
%@ 0975-8887
%V 20
%N 3
%P 22-27
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper applies cluster analysis, a form of unsupervised learning, to human protein class prediction. The data are taken from the Human Protein Reference Database (HPRD): sequences are obtained for ten molecular classes, with five amino acid sequences per class. Sequence-derived features (SDFs) are then extracted for each sequence using various web-based tools. Priorities are assigned to the SDFs by analyzing the variation in their values. Because every sequence has a value for every SDF, the resulting data form a complete weighted bipartite graph with two independent sets of nodes: one containing all the sequences and the other containing all the SDFs. The bipartite graph is represented in memory as a weighted adjacency matrix. Clusters of the data in the adjacency matrix are generated from the values of the input SDFs, taking the priority of each SDF into account. Those clusters are then backtracked to predict the class of the entered sequence.
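The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the priority rule, the clustering rule, or the feature set, so everything concrete here is an assumption: the feature names, the toy values, the class labels `class_A` and `class_B`, the use of the coefficient of variation as a priority, and the nearest-rows cluster with a majority vote.

```python
from collections import Counter

# Hypothetical toy data (not from HPRD). Rows are sequences, columns are SDFs,
# so WEIGHTS is the weighted adjacency matrix of a complete bipartite graph.
SDF_NAMES = ["length", "molecular_weight_kda", "isoelectric_point"]  # assumed
WEIGHTS = [
    [310.0, 34.1, 6.2],  # sequence 0
    [325.0, 35.8, 6.0],  # sequence 1
    [120.0, 13.5, 9.1],  # sequence 2
    [131.0, 14.2, 8.8],  # sequence 3
]
LABELS = ["class_A", "class_A", "class_B", "class_B"]  # made-up class labels


def column_priorities(matrix):
    """Assumed rule: an SDF's priority is its share of the total
    coefficient of variation across all SDF columns."""
    n_cols = len(matrix[0])
    cvs = []
    for j in range(n_cols):
        col = [row[j] for row in matrix]
        mean = sum(col) / len(col)
        var = sum((v - mean) ** 2 for v in col) / len(col)
        cvs.append((var ** 0.5) / abs(mean) if mean else 0.0)
    total = sum(cvs)
    if not total:
        return [1.0 / n_cols] * n_cols
    return [cv / total for cv in cvs]


def weighted_distance(a, b, priorities, scales):
    """Priority-weighted, range-normalised absolute difference."""
    return sum(p * abs(x - y) / s
               for x, y, p, s in zip(a, b, priorities, scales))


def predict_class(query, matrix, labels, k=2):
    """Form a cluster of the k rows closest to the query, then trace the
    cluster members back to their labels and take a majority vote."""
    priorities = column_priorities(matrix)
    n_cols = len(matrix[0])
    # Column range, falling back to 1.0 when a column is constant.
    scales = [
        (max(r[j] for r in matrix) - min(r[j] for r in matrix)) or 1.0
        for j in range(n_cols)
    ]
    ranked = sorted(
        range(len(matrix)),
        key=lambda i: weighted_distance(query, matrix[i], priorities, scales),
    )
    cluster = ranked[:k]
    votes = Counter(labels[i] for i in cluster)
    return votes.most_common(1)[0][0], cluster


if __name__ == "__main__":
    label, members = predict_class([128.0, 14.0, 9.0], WEIGHTS, LABELS)
    print(label, sorted(members))
```

Storing the graph as a dense matrix suits this setting because the graph is complete: every sequence has a weight for every SDF, so no entries are missing.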

Index Terms

Computer Science
Information Sciences

Keywords

Protein class prediction, cluster analysis, bipartite graph