Research Article

Implementation and Evaluation of mpiBLAST-PIO on HPC Cluster

by Nisha Dhankher, O P Gupta
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 97 - Number 21
Year of Publication: 2014
Authors: Nisha Dhankher, O P Gupta
10.5120/17131-7735

Nisha Dhankher, O P Gupta. Implementation and Evaluation of mpiBLAST-PIO on HPC Cluster. International Journal of Computer Applications. 97, 21 (July 2014), 18-23. DOI=10.5120/17131-7735

@article{ 10.5120/17131-7735,
author = { Nisha Dhankher and O P Gupta },
title = { Implementation and Evaluation of mpiBLAST-PIO on HPC Cluster },
journal = { International Journal of Computer Applications },
issue_date = { July 2014 },
volume = { 97 },
number = { 21 },
month = { July },
year = { 2014 },
issn = { 0975-8887 },
pages = { 18-23 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume97/number21/17131-7735/ },
doi = { 10.5120/17131-7735 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:24:43.365074+05:30
%A Nisha Dhankher
%A O P Gupta
%T Implementation and Evaluation of mpiBLAST-PIO on HPC Cluster
%J International Journal of Computer Applications
%@ 0975-8887
%V 97
%N 21
%P 18-23
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Because genomic databases are growing exponentially, traditional sequence search techniques have become too slow. To address this problem, mpiBLAST, an open-source parallel version of BLAST, was developed. In mpiBLAST, the master process distributes database fragments among worker nodes so that the sequence search is computed in parallel. Because the master process merges and writes the results sequentially, it becomes a performance bottleneck as the number of processors increases and the database size varies. mpiBLAST-PIO was introduced to handle this high non-search overhead. This paper describes mpiBLAST-PIO, an optimized and extended version of mpiBLAST. The goal of this research was to investigate the performance of the parallel implementation of BLAST in comparison to sequential NCBI-BLAST by measuring speedup and efficiency on an HPC platform using InfiniBand. Different options of mpiBLAST-PIO were activated, which helped in identifying the optimal parameters for a highly scalable parallel BLAST implementation. The results show that parallel writing of the results can be an efficient solution when a high-performance parallel file system is available.
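
The speedup and efficiency metrics named in the abstract are conventionally defined as S = T_serial / T_parallel and E = S / p, where p is the number of processes. The following is a minimal sketch of that calculation under those standard definitions. The function names and the timing values are hypothetical and are used only for illustration; they are not taken from the paper's measurements.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T_serial / T_parallel (standard definition)."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, n_procs: int) -> float:
    """Parallel efficiency E = S / p, where p is the process count."""
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    return speedup(t_serial, t_parallel) / n_procs


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, NOT results from the paper.
    t_serial = 1200.0                          # sequential NCBI-BLAST run
    t_parallel = {2: 640.0, 4: 340.0, 8: 200.0}  # process count -> parallel run time

    for p in sorted(t_parallel):
        s = speedup(t_serial, t_parallel[p])
        e = efficiency(t_serial, t_parallel[p], p)
        print(f"p={p}: speedup={s:.2f}, efficiency={e:.2f}")
```

An efficiency that falls as p grows is what a sequential result-merging step would be expected to produce, which is the non-search overhead the abstract says mpiBLAST-PIO targets.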

Index Terms

Computer Science
Information Sciences

Keywords

mpiBLAST-PIO, Parallel & Distributed Computing, High Performance Computing, Bioinformatics