CFP last date
22 April 2024
Reseach Article

Fast Longest Common Subsequences for Bioinformatics Dynamic Programming

by Arabi E. Keshk, Mohammed Ossman, Lamiaa Fathi Hussein
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 57 - Number 22
Year of Publication: 2012
Authors: Arabi E. Keshk, Mohammed Ossman, Lamiaa Fathi Hussein
10.5120/9419-3569

Arabi E. Keshk, Mohammed Ossman, Lamiaa Fathi Hussein . Fast Longest Common Subsequences for Bioinformatics Dynamic Programming. International Journal of Computer Applications. 57, 22 ( November 2012), 12-18. DOI=10.5120/9419-3569

@article{ 10.5120/9419-3569,
author = { Arabi E. Keshk, Mohammed Ossman, Lamiaa Fathi Hussein },
title = { Fast Longest Common Subsequences for Bioinformatics Dynamic Programming },
journal = { International Journal of Computer Applications },
issue_date = { November 2012 },
volume = { 57 },
number = { 22 },
month = { November },
year = { 2012 },
issn = { 0975-8887 },
pages = { 12-18 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume57/number22/9419-3569/ },
doi = { 10.5120/9419-3569 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:01:06.359298+05:30
%A Arabi E. Keshk
%A Mohammed Ossman
%A Lamiaa Fathi Hussein
%T Fast Longest Common Subsequences for Bioinformatics Dynamic Programming
%J International Journal of Computer Applications
%@ 0975-8887
%V 57
%N 22
%P 12-18
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Bioinformatics applications represent an increasingly important workload to improve the programs of sequence analysis. It can be used to assign function to genes and proteins by the study of the similarities between the compared sequences. This paper introduces a modified implementation of bioinformatics algorithm for sequence alignment . The implemented algorithm is called Fast Longest Common Subsequence (FLCS). It is filling the three main diagonals without filling the entire matrix by the unused data. It gets the optimal solution but the execution time is decreased and the performance is high. To illustrate the effectiveness of optimizing the performance of the proposed FLCS algorithm and demonstrate its superiority, it is compared with Needleman-Wunsch, Smith-Waterman and Longest Common Subsequence algorithms.

References
  1. Dimitris Papamichail and Georgios Papamichail2,"Improved algorithms for approximate string matching (extended abstract)"BMC Bioinformatics 2009.
  2. Wagner, R. A. and Fischer, M. J. (1974). "The string -to-string correction problem". Journal of the ACM 21 (1) , 1974: 168–173.
  3. Moulton, V. , Singl, M. ALGORITHMS IN BIOINFORMATICS, 10thInternational workshop, WABI , Proceedings 2010, 20-22.
  4. Tahir Naveed, Imitaz Siddiqui, Shaftab Ahmed, "Parallel Needleman-Wunsch Algorithm for Grid", Proceedings of the PAK-US International Symposium on High Capacity Optical Networks and Enabling Technologies. Islamabad, Pakistan, Dec 19 -21, 2005.
  5. MacIntosh, G. C. , Wilkerson, C. , Green, P. J. (2001). Identification and analysis of analysis of Arabidopsis expressed sequence tags characteristic of noncoding RNAs. Plant Physiol. 127(3): 765-776.
  6. Casey, R. M. (2005). "BLAST Sequences Aid in Genomics and Proteomics". Business Intelligence Network . http://www. b-eye-network. com/view/1730.
  7. Lopez, C. , Piegu, B. , Cooke, R. , Delseny, M. , Tohme, J. , Verdier, V. Using cDNA and genomic sequences as tools to develop SNP strategies in cassava (Manihot esculenta Crantz) . Theor. Appl. Genet, 2005 110: 425-431. 47.
  8. Diaz, D. , Esteban, F. J. , Hamandez, P. , Caballero, J. A. , Dorado G. ,Galvez, S. (2011),Parallelizing and optimizing a bioinformatics pairwise sequence alignment algorithm for many-core architecture ,journal: Parallel computing-PC,VOL. 37,no. 4-5, pp . 244-259.
  9. Source of DNA Sequences (online),National Center Biotechnology Information. Available: http://www. ncbi. nlm. nih. gov/mapview.
  10. Bin Wang, Implementation of a dynamic programming algorithm for DNA sequences alignment on the cell Matrix Architecture (online), Utah State University, Logan, Utah. Available: http://www. cellmatrix. com/entryway/products/pub/wang 2002. pdf
  11. Needleman, S. B. and Wunsch, C. D . (1970). "A general method applicable to the search for similarities in the amino acid sequence of two proteins". Journal of Molecular Biology. 1970, 443–453.
  12. Smith, T. F. and M. S. Waterman, Identification of common molecular subsequences, Journal of Molecular Biology, 1981, 147: 195-197.
  13. Bergroth, L. , Hakonen, H. and Raita, T. "A Survey of Longest Common Subsequence Algorithms". SPIRE (IEEE Computer Society), 2000,39–48.
Index Terms

Computer Science
Information Sciences

Keywords

computational biology algorithm Expressed Sequence Tag heuristic algorithms BLAST FASTA dynamic algorithms Needleman-Wunsch Smith-Waterman LCS