High Throughput Iterative VLSI Architecture for Cholesky Factorization Based Matrix Inversion

D. N. Sonawane; M. S. Sutaone

Call for Paper

June Edition

IJCA solicits high quality original research papers for the upcoming June edition of the journal. The last date of research paper submission is 20 May 2025

Submit your paper

Know more

The week's pick

Disease Detection in Tea Leaves: A Hybrid Model Using YOLOv7 and DCNN

Md Zahidul Kabir Md Sourav Hossen Sumiya Kaisar Keya

Random Articles

Kannada Word Sense Disambiguation for Machine Translation

November

2011

Performance Analysis of WDM-RoF System with CO-OFDM for Long Distance Communication

November

2015

Efficient Multipath Routing Protocol based on Path Survivability Factor

October

2014

Article:Economic Thermal Power Dispatch with Emission Constraint and Valve Point Effect Loading Using Improved Tabu Search Algorithm

July

2010

Reseach Article

High Throughput Iterative VLSI Architecture for Cholesky Factorization Based Matrix Inversion

by D. N. Sonawane, M. S. Sutaone

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 35 - Number 8

Year of Publication: 2011

Authors: D. N. Sonawane, M. S. Sutaone

10.5120/4419-6140

D. N. Sonawane, M. S. Sutaone . High Throughput Iterative VLSI Architecture for Cholesky Factorization Based Matrix Inversion. International Journal of Computer Applications. 35, 8 ( December 2011), 10-15. DOI=10.5120/4419-6140

@article{ 10.5120/4419-6140,

author = { D. N. Sonawane, M. S. Sutaone },

title = { High Throughput Iterative VLSI Architecture for Cholesky Factorization Based Matrix Inversion },

journal = { International Journal of Computer Applications },

issue_date = { December 2011 },

volume = { 35 },

number = { 8 },

month = { December },

year = { 2011 },

issn = { 0975-8887 },

pages = { 10-15 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume35/number8/4419-6140/ },

doi = { 10.5120/4419-6140 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-06T20:21:25.749636+05:30

%A D. N. Sonawane

%A M. S. Sutaone

%T High Throughput Iterative VLSI Architecture for Cholesky Factorization Based Matrix Inversion

%J International Journal of Computer Applications

%@ 0975-8887

%V 35

%N 8

%P 10-15

%D 2011

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Cholesky factorization is the computationally most expensive step in numerically solving positive definite systems. Due to inherently recursive computation process and associated floating point division and square root operations in Cholesky factorization, it is very difficult to obtain acceleration by exploiting parallelism on FPGA’s. To solve this problem, approach suggests iterative architecture with parallelly fetching the matrix elements using customized Diagonal Processing Elements (DPU), Non Diagonal Processing Elements (NDPU) and Triangular Processing Elements (TPU) as computational processing units. The use of LNS approach using LUT technique for floating point square root and division arithmetic eventually improves resource and clock cycle utilization. Scheme is implemented using Xilinx Virtex-4 FPGA and achieves 0.032µs clock latency and obtained a throughput of 31.25Mupdates/s operating at 125 MHz for 4x4 matrix inversion problem.

References

Antonio Roldao and George A. Constantinides, “A High Throughput FPGA-Based Floating Point Conjugate Gradient Implementation for Dense Matrices”, ACM Transaction on Reconfigurable Technoogy Systems, Vol. 3 (1), Jan. 2010.
J. Sun, et.al, “High Performance Mixed-Precision Linear Solver for FPGAs,” IEEE Transaction on Computers, vol. 57, (12) Dec. 2008
D. Yang, et. al, "Performance Comparison of Cholesky Decomposition on GPUs and FPGAs," Symposium on Application Accelerators in High Performance Computing (SAAHPC), knoxville, TN, 2010.
D. Yang, G. D. Peterson, and H. Li, "High Performance Reconfigurable Computing for Cholesky Decomposition," Symposium on Application Accelerators in High Performance Computing (SAAHPC), UIUC, July, 2009.
D. Yang, et.al, "An FPGA Implementation for Solving Least Square Problem," The 17th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Napa, California, April. 2009.
Antonio Roldao Lopes and George A. Constantinides, “A High Throughput FPGA-Based Floating Point Conjugate GradientImplementation”, Proceedings of the 4th international workshop onReconfigurable Computing: Architectures, Tools and Applications, Springer-Verlag, Berlin, Heidelberg,2008, pp 75-86
O. Maslennikow, et.al, “Parallel Implementation of Cholesky LLT-Algorithm in FPGA-Based Processor,” Proceedings of the 7th international conference on Parallel processing and applied mathematics Springer- Verlag Berlin Heidelberg, , June-2007pp. 137-147
Maslennikow,et.al.,"Implementation of Cholesky LLT-Decomposition Algorithm in FPGA-Based Rational Fraction Parallel Processor," Mixed Design of Integrated Circuits and Systems, 14th International conference on mixed design (MIXDES 2007),Ciechocinek, POLAND, 21-23 June 2007, pp.287-292
Burian, A. et.al , "A fixed-point implementation of matrix inversion using Cholesky decomposition," Micro-NanoMechatronics and Human Science, Proceedings of the 46th IEEE International Midwest Symposium on In Circuits and Systems, 2003. MWSCAS , Dec. 2003, pp. 1431- 1434
S. Haridas, “FPGA Implementation of a Cholesky Algorithm for a Shared-Memory Multiprocessor Architecture,”, A masters thesis, submitted to the faculty of New Jersey Institute of Technology, May 2003.
Garcia et al. “LNS Architectures for Embedded Model PredictiveControl Processors”In Proceedings of International Conference onCompilers, Architecture, and Synthesis for Embedded Systems,CASES 2004, Washington DC, USA, Sep. 2004
A book on “VLSI Digital Signal Processing System Design and Implementation”, by K. K. Parhi, Wiley, 1999.

Index Terms

Computer Science

Information Sciences

Keywords

Choleskey Factorization FPGA’s Iterative Architetcure Virtex-4