Research Article

A Time Efficient Approach for Quick Splitting and Finding Maximum Matches Sequences in Bio-sequence Analysis

by Mohammad Hasan, Muhammad Ibrahim Khan, Fatema Tuz Zohra, Abu Saleh Musa Miah, Ashrafun Zannat, Md. Al Hasan, and Md. Mamunur Rashid
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 176 - Number 23
Year of Publication: 2020
Authors: Mohammad Hasan, Muhammad Ibrahim Khan, Fatema Tuz Zohra, Abu Saleh Musa Miah, Ashrafun Zannat, Md. Al Hasan, and Md. Mamunur Rashid
10.5120/ijca2020920252

Mohammad Hasan, Muhammad Ibrahim Khan, Fatema Tuz Zohra, Abu Saleh Musa Miah, Ashrafun Zannat, Md. Al Hasan, and Md. Mamunur Rashid. A Time Efficient Approach for Quick Splitting and Finding Maximum Matches Sequences in Bio-sequence Analysis. International Journal of Computer Applications. 176, 23 (May 2020), 42-48. DOI=10.5120/ijca2020920252

@article{ 10.5120/ijca2020920252,
author = { Mohammad Hasan and Muhammad Ibrahim Khan and Fatema Tuz Zohra and Abu Saleh Musa Miah and Ashrafun Zannat and Md. Al Hasan and Md. Mamunur Rashid },
title = { A Time Efficient Approach for Quick Splitting and Finding Maximum Matches Sequences in Bio-sequence Analysis },
journal = { International Journal of Computer Applications },
issue_date = { May 2020 },
volume = { 176 },
number = { 23 },
month = { May },
year = { 2020 },
issn = { 0975-8887 },
pages = { 42-48 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume176/number23/31342-2020920252/ },
doi = { 10.5120/ijca2020920252 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:43:20.466828+05:30
%A Mohammad Hasan
%A Muhammad Ibrahim Khan
%A Fatema Tuz Zohra
%A Abu Saleh Musa Miah
%A Ashrafun Zannat
%A Md. Al Hasan
%A Md. Mamunur Rashid
%T A Time Efficient Approach for Quick Splitting and Finding Maximum Matches Sequences in Bio-sequence Analysis
%J International Journal of Computer Applications
%@ 0975-8887
%V 176
%N 23
%P 42-48
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

DNA sequence analysis and comparison is a vital task in biological research, but it is demanding in both memory and time because it operates on very large data sets. A well-aligned pair of sequences reveals the matching points and mismatches between the two sequences. Our proposed algorithm is composed of two major parts. The first part is Fast Splitting (FS), a recursive algorithm that divides the source sequence into pieces of the exact length dictated by the target sequence. The second part is the Fast Maximum Matches Subsequence Finder (Fast_MMSS). It builds a specialized successor table from the identical characters of two strings: a Target Length Splitted Sequence (TLSS) and the target sequence. Then, using special pruning conditions, it obtains the final MMSS. Previous work applied dynamic programming and various brute-force techniques, which are fast but require huge amounts of memory, while our proposed algorithm balances the time and space trade-off.
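
The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the splitting rule or the pruning conditions, so the sketch assumes non-overlapping pieces of the target's length, a four-letter DNA alphabet, and a plain memoized search over the successor table in place of the paper's pruning. All function names (`fast_split`, `successor_table`, `max_match_length`, `best_chunk`) are hypothetical.

```python
from functools import lru_cache

ALPHABET = "ACGT"  # assumption: plain DNA alphabet, other symbols are ignored


def fast_split(source, length):
    """Recursively split `source` into consecutive pieces of `length`
    characters (assumed non-overlapping; the last piece may be shorter).
    Recursion depth grows with len(source) / length, so this suits
    short illustrative inputs only."""
    if length <= 0:
        raise ValueError("length must be positive")
    if len(source) <= length:
        return [source] if source else []
    return [source[:length]] + fast_split(source[length:], length)


def successor_table(seq):
    """table[c][i] is the smallest 1-based position p > i where
    seq[p - 1] == c, or 0 if the character does not occur after i."""
    n = len(seq)
    table = {c: [0] * (n + 1) for c in ALPHABET}
    for c in ALPHABET:
        nxt = 0
        for i in range(n, 0, -1):
            if seq[i - 1] == c:
                nxt = i
            table[c][i - 1] = nxt
    return table


def max_match_length(a, b):
    """Length of the longest common subsequence of `a` and `b`, found by
    following identical character pairs through both successor tables.
    Memoization stands in for the paper's pruning conditions."""
    ta, tb = successor_table(a), successor_table(b)

    @lru_cache(maxsize=None)
    def best(i, j):
        result = 0
        for c in ALPHABET:
            ni, nj = ta[c][i], tb[c][j]
            if ni and nj:  # c occurs after i in a and after j in b
                result = max(result, 1 + best(ni, nj))
        return result

    return best(0, 0)


def best_chunk(source, target):
    """Return the target-length piece of `source` that shares the longest
    common subsequence with `target` (first piece wins on ties)."""
    chunks = fast_split(source, len(target))
    return max(chunks, key=lambda chunk: max_match_length(chunk, target))
```

For example, `fast_split("ACGTTGCAACGA", 4)` yields the pieces `ACGT`, `TGCA` and `ACGA`, and `best_chunk("ACGTTGCAACGA", "ACGA")` selects `ACGA`. Taking the earliest next occurrence of each character in both strings is what lets the successor table replace a full dynamic-programming matrix in this sketch.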

Index Terms

Computer Science
Information Sciences

Keywords

Fast Splitting (FS), Maximum Matches Subsequences (MMSS), Target Length Splitted Sequence (TLSS), Successor Table