CFP last date
22 April 2024
Reseach Article

Enhancing Parallel Recursive Brute Force Algorithm for Motif Finding

by Marwa A. Radad, Nawal A. El-fishawy, Hossam M. Faheem
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 86 - Number 3
Year of Publication: 2014
Authors: Marwa A. Radad, Nawal A. El-fishawy, Hossam M. Faheem
10.5120/14965-3143

Marwa A. Radad, Nawal A. El-fishawy, Hossam M. Faheem . Enhancing Parallel Recursive Brute Force Algorithm for Motif Finding. International Journal of Computer Applications. 86, 3 ( January 2014), 15-22. DOI=10.5120/14965-3143

@article{ 10.5120/14965-3143,
author = { Marwa A. Radad, Nawal A. El-fishawy, Hossam M. Faheem },
title = { Enhancing Parallel Recursive Brute Force Algorithm for Motif Finding },
journal = { International Journal of Computer Applications },
issue_date = { January 2014 },
volume = { 86 },
number = { 3 },
month = { January },
year = { 2014 },
issn = { 0975-8887 },
pages = { 15-22 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume86/number3/14965-3143/ },
doi = { 10.5120/14965-3143 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:03:15.899150+05:30
%A Marwa A. Radad
%A Nawal A. El-fishawy
%A Hossam M. Faheem
%T Enhancing Parallel Recursive Brute Force Algorithm for Motif Finding
%J International Journal of Computer Applications
%@ 0975-8887
%V 86
%N 3
%P 15-22
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Motif search is a fundamental problem in bioinformatics. Its main application is locating transcription factor binding sites (TFBSs) in DNA sequences. Numerous algorithms have been proposed in the literature to solve this problem. The exact algorithms solve M(l,d) problem by reporting all l-length motifs M with at most d mutations. Recursive Brute Force (R-BF) algorithm is an exact algorithm that has solved M(l,d) problem in exponential time with d. Multicore implementations of R-BF have efficiently utilized computation resources of modern multicore architectures to achieve more advantageous operation than sequential one. In this paper, we explore an enhanced version of R-BF algorithm. The new algorithm is called R-BF2. R-BF2 enhances the running time of R-BF by memorizing more information about each node in search space. R-BF2 pays more than 40% memory space to achieve a speedup factor of 3. However, parallel implementations of R-BF2 keep the same scalability just like R-BF on multicore systems.

References
  1. Gopal, S. , Haake, A. , Jones, R. P. and Tymann, P. 2009. Bioinformatics, A Computing Perspective, McGrawHill, International Edition.
  2. Pvzner, P. and Sze, S. H. 2000. Combinatorial approaches to finding subtle signals in DNA sequences. In Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology, 269–278.
  3. Bailey, T. L. , Williams, N. , Misleh, C. , and Li, W. W. 2006. "MEME: discovering and analyzing DNA and protein motifs", Nucleic Acid Research, Vol. 34, 369–373.
  4. Lawrence, C. E. , Altschul, S. F. , Boguski, M. S. , Liu, J. S. , Neuwald, A. F. , and Wootton, J. C. 1993. "Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment", Science, Vol. 262, 208–214.
  5. Roth, F. , Hughes, J. , Estep, P. , and Church, G. 1998. "Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole genome mRNA quantitation", Nature Biotechnology, Vol. 16, 939–945.
  6. Liu, X. , Brutlag, D. L. and Liu, J. S. 2001. BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes. In Proceedings of Pacific Symposium on Biocomputing, 127-138.
  7. Shida, K. 2006. Hybrid Gibbs-Sampling algorithm for challenging motif discovery: GibbsDST. In Proceedings of the 17th International Conference on Genome Informatics, 3-13.
  8. Jones, N. C. and Pevzner, P. A. 2004. An Introduction to Bioinformatics Algorithms. The MIT Press.
  9. Rajasekaran, S. , Balla, S. and Huang, C-H. 2005. "Exact algorithms for planted motif challenge problems", APBC, 249-259.
  10. Rajasekaran, S. , Balla, S. , Huang, C-H, Thapar, V. , Gryk, M. , Maciejewski, M. and Schiller, M. 2005. "High-performance exact algorithms for motif search", Journal of Clinical Monitoring and Computing, Vol 19. 319–328.
  11. Rajasekaran, S. and Dinh, H. 2011. A speedup technique for (l, d) motif finding algorithms, BMC Research Notes, Vol. 4, No. 54, 1–7.
  12. Dinh, H. , Rajasekaran, S. , and Kundeti, V. 2011. "Pms5: an efficient exact algorithm for the (l,d) motif finding problem", BMC Bioinformatics, Vol. 12, No. 410.
  13. Bandyopadhyay, S. , Sahni, S. , and Rajasekaran, S. 2012. Pms6: A fast algorithm for motif discovery. In Second IEEE International Conference on Computational Advances in Bio and Medical Sciences (ICCABS).
  14. Dinh, H. , Rajasekaran, S. , Davila, J. J. 2012. "qPMS7: A Fast Algorithm for Finding (l, d)-Motifs in DNA and Protein Sequences", PLoS ONE, Vol. 7, No. 7, e41425.
  15. Pavesi, G. , Mauri, G. and Pesole, G. 2001. "An algorithm for finding signals of unknown length in DNA sequences", ISMB (Supplement of Bioinformatics),Vol 17, No. 1, 207-214.
  16. Floratou, A. , Tata, S. and Patel, J. M. 2010. Efficient and accurate discovery of patterns in sequence datasets, In Proceedings of the IEEE 26th International Conference on Data Engineering – ICDE, 461-472.
  17. Eskin, E. and Pevzner, P. A. 2002. "Finding composite regulatory patterns in DNA sequences", Bioinformatics, Vol 18, No. 1, 354-363.
  18. Radad M. A. , El-Fishawy, N. A. , Faheem, H. M. 2013. "Implementation of Recursive Brute Force for Solving Motif Finding Problem on Multi-core", International Journal of Systems Biology and Biomedical Technology ( IJSBBT), 2013, in press.
  19. OpenMP: http://www. openmp. org ( Last visited 14-12-2013).
  20. MPI Forum: http://www. mpi-forum. org ( Last visited 14-12-2013)
  21. MPICH: http://www-unix. mcs. anl. gov/mpi/ mpich2 (Last visited 14-12-2013)
  22. Eadline, D. 2007. MPI on Multicore, an OpenMP Alternative, Linux Magazine, http://www. linux-mag. com/id/4608/(Last visited 14-12-2013).
  23. Mallón, D. A. , Taboada, G. L. , Teijeiro, Ca. , Touriño, J. , Fraguela, B. B. , Gómez, A. , Doallo, R. , Mouriño, J. C. 2009. Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures, In proceeding of the 16th Recent Advances in Parallel Virtual Machine and Message Passing Interface, PVM/MPI2009, Espoo, Finland, 174-184.
  24. Rouchka, E. C. and Hardin, C. T. 2007. "rMotifGen: random motif generator for DNA and protein sequences", BMC Bioinformatics, Vol. 8, 2007.
Index Terms

Computer Science
Information Sciences

Keywords

Bioinformatics Motif finding Branch & Bound search Parallel programming OpenMP MPI