Automated Tool to Generate Parallel CUDA Code from a Serial C Code

Akhil Jindal, Nikhil Jindal, Divyashikha Sethia

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 50 - Number 8

Year of Publication: 2012

Authors: Akhil Jindal, Nikhil Jindal, Divyashikha Sethia

DOI: 10.5120/7790-0891

Akhil Jindal, Nikhil Jindal, and Divyashikha Sethia. Automated Tool to Generate Parallel CUDA Code from a Serial C Code. International Journal of Computer Applications. 50, 8 (July 2012), 15-21. DOI=10.5120/7790-0891

@article{ 10.5120/7790-0891,
author = { Akhil Jindal and Nikhil Jindal and Divyashikha Sethia },
title = { Automated Tool to Generate Parallel CUDA Code from a Serial C Code },
journal = { International Journal of Computer Applications },
issue_date = { July 2012 },
volume = { 50 },
number = { 8 },
month = { July },
year = { 2012 },
issn = { 0975-8887 },
pages = { 15-21 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume50/number8/7790-0891/ },
doi = { 10.5120/7790-0891 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

Abstract

With the introduction of GPGPUs, parallel programming has become simple and affordable. APIs such as NVIDIA's CUDA have attracted many programmers to port their applications to GPGPUs, but writing CUDA code remains a challenging task. Moreover, the vast repositories of legacy serial C code still in wide use in industry cannot take advantage of this extra computing power. Many attempts have therefore been made to develop auto-parallelization techniques that convert serial C code into corresponding parallel CUDA code. Some parallelizers allow programmers to add "hints" to their serial programs, while another approach has been to build an interactive system between programmers and parallelizing tools or compilers. None of these techniques is truly automatic, since the programmer remains fully involved in the process. In this paper, we present an automatic parallelization tool that completely relieves the programmer of any involvement in the parallelization process. Preliminary results on a basic set of common C codes show that the tool achieves a significant speedup of about 10 times.
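To make the kind of transformation discussed in the abstract concrete, the following is a minimal sketch and not the authors' tool or its output. It assumes the simplest case, a loop with no loop-carried dependence, and emulates the CUDA thread-index mapping in plain C so that it runs without a GPU. The function names (`vec_add_serial`, `vec_add_thread`, `vec_add_emulated`) and the parameters `block_idx`, `block_dim`, and `thread_idx` are hypothetical; they stand in for CUDA's built-in `blockIdx.x`, `blockDim.x`, and `threadIdx.x`.

```c
#include <assert.h>

/* Serial original: each iteration writes only c[i], so there is no
   loop-carried dependence and the iterations may run in any order. */
void vec_add_serial(const float *a, const float *b, float *c, int n)
{
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Kernel-style body: the loop is gone and one call handles one element.
   In real CUDA this would be a __global__ function computing
   i = blockIdx.x * blockDim.x + threadIdx.x; the three index
   parameters here are illustrative stand-ins for those built-ins. */
static void vec_add_thread(const float *a, const float *b, float *c, int n,
                           int block_idx, int block_dim, int thread_idx)
{
    int i = block_idx * block_dim + thread_idx;
    if (i < n)                  /* guard: the last block may be partly idle */
        c[i] = a[i] + b[i];
}

/* Host-side emulation of a kernel launch: visits every (block, thread)
   pair sequentially. On a GPU these calls would execute in parallel. */
void vec_add_emulated(const float *a, const float *b, float *c, int n,
                      int block_dim)
{
    assert(block_dim > 0);
    int grid_dim = (n + block_dim - 1) / block_dim;  /* ceil(n / block_dim) */
    for (int block = 0; block < grid_dim; block++)
        for (int thread = 0; thread < block_dim; thread++)
            vec_add_thread(a, b, c, n, block, block_dim, thread);
}
```

The rounding-up of the grid size and the `i < n` guard are the two details that make the mapping safe when `n` is not a multiple of the block size. A real port would additionally need device memory allocation and host-device copies, which this sketch leaves out.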

Index Terms

Computer Science

Information Sciences

Keywords

Auto parallelization, parallelization, C, CUDA, hiCUDA, GPU