32-Bit NxN Matrix Multiplication: Performance Evaluation for Altera FPGA, i5 Clarkdale and Atom Pineview-D Intel General Purpose Processors

International Journal of Computer Applications
© 2012 by IJCA Journal
Volume 52 - Number 11
Year of Publication: 2012
Authors:
Izzeldin Ibrahim Mohd
Chay Chin Fatt
Muhammad N. Marsono
DOI: 10.5120/8246-1757

Izzeldin Ibrahim Mohd, Chay Chin Fatt and Muhammad N Marsono. Article: 32-Bit NxN Matrix Multiplication: Performance Evaluation for Altera FPGA, i5 Clarkdale and Atom Pineview-D Intel General Purpose Processors. International Journal of Computer Applications 52(11):17-23, August 2012. Full text available. BibTeX

@article{key:article,
	author = {Izzeldin Ibrahim Mohd and Chay Chin Fatt and Muhammad N. Marsono},
	title = {Article: 32-Bit NxN Matrix Multiplication: Performance Evaluation for Altera FPGA, i5 Clarkdale and Atom Pineview-D Intel General Purpose Processors},
	journal = {International Journal of Computer Applications},
	year = {2012},
	volume = {52},
	number = {11},
	pages = {17-23},
	month = {August},
	note = {Full text available}
}

Abstract

Mobile devices now represent a significant portion of the embedded systems market and are in continuous demand in daily life. From the end-user perspective, size, weight, and features are the key quality criteria. These criteria have become standard constraints in the embedded systems design process and have a large impact on power consumption. This paper surveys and explores different low-power design techniques for FPGAs and processors. We compare, evaluate, and analyze the power and energy consumption of three designs: an Altera Cyclone II FPGA implementing systolic array matrix multiplication, and the Intel i5 Clarkdale and Atom Pineview-D general-purpose processors. Each design multiplies two n×n 32-bit matrices and produces a 64-bit matrix as output. We conclude that the FPGA is more power and energy efficient for small matrix sizes. However, general-purpose processor performance approaches that of the FPGA for larger matrix sizes, as the larger cache in the general-purpose processors helps reduce latency. We also conclude that the latency of the FPGA design can be improved if more systolic array processing elements are implemented in parallel to allow more concurrency.
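
To illustrate the kind of computation being compared, the following is a minimal software sketch of a two-dimensional, output-stationary systolic array multiplying two n×n matrices. It is an assumption-laden illustration, not the Cyclone II design evaluated in the paper: the function name `systolic_matmul`, the skewed data schedule, and the cycle count are hypothetical choices for a textbook array with one processing element (PE) per output entry. Inputs are checked against the signed 32-bit range, but accumulator width and overflow are not modeled.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def systolic_matmul(a, b):
    """Simulate an n x n output-stationary systolic array computing a * b.

    Assumed schedule (illustrative only): row i of `a` enters from the left
    delayed by i cycles, and column j of `b` enters from the top delayed by
    j cycles. PE(i, j) therefore sees a[i][k] and b[k][j] together at cycle
    t = i + j + k and adds their product to its local accumulator.

    Returns (c, cycles), where cycles is the number of simulated steps.
    """
    n = len(a)
    for m in (a, b):
        if len(m) != n or any(len(row) != n for row in m):
            raise ValueError("both inputs must be n x n")
        if any(not INT32_MIN <= v <= INT32_MAX for row in m for v in row):
            raise ValueError("entries must fit in a signed 32-bit integer")

    acc = [[0] * n for _ in range(n)]   # one accumulator per PE
    cycles = max(3 * n - 2, 0)          # last product lands at t = 3n - 3

    for t in range(cycles):
        # All PEs work concurrently within a cycle; the loops only emulate that.
        for i in range(n):
            for j in range(n):
                k = t - i - j           # operand index reaching PE(i, j) now
                if 0 <= k < n:
                    acc[i][j] += a[i][k] * b[k][j]
    return acc, cycles


if __name__ == "__main__":
    c, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(c, cycles)
```

Under these assumptions the array finishes in 3n - 2 steps using n² PEs, whereas a single sequential multiply-accumulate unit needs n³ steps. This is the sense in which adding processing elements trades area for latency.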
