Research Article

Enhancing Accuracy for Protein Prediction Secondary Structure by a New Hybrid Method

by Youcef Gheraibia, Abdelouahab Moussaoui
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 34 - Number 2
Year of Publication: 2011
DOI: 10.5120/4074-5863

Youcef Gheraibia, Abdelouahab Moussaoui. Enhancing Accuracy for Protein Prediction Secondary Structure by a New Hybrid Method. International Journal of Computer Applications. 34, 2 (November 2011), 35-40. DOI=10.5120/4074-5863

@article{ 10.5120/4074-5863,
author = { Youcef Gheraibia and Abdelouahab Moussaoui },
title = { Enhancing Accuracy for Protein Prediction Secondary Structure by a New Hybrid Method },
journal = { International Journal of Computer Applications },
issue_date = { November 2011 },
volume = { 34 },
number = { 2 },
month = { November },
year = { 2011 },
issn = { 0975-8887 },
pages = { 35-40 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume34/number2/4074-5863/ },
doi = { 10.5120/4074-5863 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:20:05.580362+05:30
%A Youcef Gheraibia
%A Abdelouahab Moussaoui
%T Enhancing Accuracy for Protein Prediction Secondary Structure by a New Hybrid Method
%J International Journal of Computer Applications
%@ 0975-8887
%V 34
%N 2
%P 35-40
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Predicting a protein's secondary structure is an important step toward determining its three-dimensional structure and its function. This paper describes a new technique for protein secondary structure prediction based on contemporary machine learning and data mining. Several methods have been developed to predict secondary structure from the amino acid sequence; these methods reach accuracies of up to 80%. The work consists of three parts. In the first part, the secondary structure of each amino acid is predicted independently with a naive Bayes classifier; this step is based on amino acid preferences for the different secondary structures. In the second part, an evolutionary algorithm is used to improve this prediction; this step is based on the physicochemical properties of protein regions. In the last part, a fragment bank containing the protein fragments frequently found in the Protein Data Bank (PDB) was developed; this step is based on protein sequence alignment, but against a reduced database. The results show that the proposed method improves on the best known predictive accuracy by 4.5% and attains 85% accuracy on different datasets.
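
The first part of the method, a per-residue naive Bayes prediction driven by amino acid preferences, could look roughly like the sketch below. This is an illustration only and not the authors' implementation: the three-state labelling (H, E, C), the Laplace smoothing, the function names `train`, `predict` and `q3`, and the tiny `TOY` training set are all assumptions made for the example, and the toy sequences are invented rather than taken from the PDB.

```python
import math
from collections import Counter, defaultdict

STATES = "HEC"  # helix, strand, coil: an assumed three-state labelling
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def train(pairs, alpha=1.0):
    """Estimate log P(state) and log P(residue | state) with Laplace smoothing."""
    state_counts = Counter()
    emit_counts = defaultdict(Counter)
    for seq, labels in pairs:
        for aa, state in zip(seq, labels):
            state_counts[state] += 1
            emit_counts[state][aa] += 1
    total = sum(state_counts.values())
    log_prior = {
        s: math.log((state_counts[s] + alpha) / (total + alpha * len(STATES)))
        for s in STATES
    }
    log_like = {
        s: {
            aa: math.log(
                (emit_counts[s][aa] + alpha)
                / (state_counts[s] + alpha * len(AMINO_ACIDS))
            )
            for aa in AMINO_ACIDS
        }
        for s in STATES
    }
    return log_prior, log_like


def predict(seq, model):
    """Label each residue on its own with the most probable state.

    Residues outside AMINO_ACIDS raise KeyError in this sketch.
    """
    log_prior, log_like = model
    return "".join(
        max(STATES, key=lambda s: log_prior[s] + log_like[s][aa]) for aa in seq
    )


def q3(predicted, observed):
    """Fraction of residues whose predicted state matches the observed one."""
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Invented toy data (sequence, per-residue labels); not real PDB entries.
TOY = [
    ("AEELLKKA", "HHHHHHHH"),
    ("VIVTVYVG", "EEEEEEEC"),
    ("GPSGNGPD", "CCCCCCCC"),
]

model = train(TOY)
print(predict("AVG", model))
```

Because each residue is classified in isolation, a predictor of this kind ignores neighbouring residues entirely, which is the kind of limitation the later evolutionary and fragment-bank stages described in the abstract are meant to address.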

Index Terms

Computer Science
Information Sciences

Keywords

Protein secondary structure prediction, Bayes, Genetic algorithm, K nearest neighbor, Data mining, Amino acids, Hybrid method, Supervised learning