Research Article

Performance Analysis of Raspberry Pi 4B (8GB) Beowulf Cluster: HPCG Benchmarking

by Dimitrios Papakyriakou, Ioannis S. Barbounakis
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 25
Year of Publication: 2025
Authors: Dimitrios Papakyriakou, Ioannis S. Barbounakis
DOI: 10.5120/ijca2025925449

Dimitrios Papakyriakou, Ioannis S. Barbounakis. Performance Analysis of Raspberry Pi 4B (8GB) Beowulf Cluster: HPCG Benchmarking. International Journal of Computer Applications. 187, 25 (Jul 2025), 49-64. DOI=10.5120/ijca2025925449

@article{ 10.5120/ijca2025925449,
author = { Dimitrios Papakyriakou and Ioannis S. Barbounakis },
title = { Performance Analysis of Raspberry Pi 4B (8GB) Beowulf Cluster: HPCG Benchmarking },
journal = { International Journal of Computer Applications },
issue_date = { Jul 2025 },
volume = { 187 },
number = { 25 },
month = { Jul },
year = { 2025 },
issn = { 0975-8887 },
pages = { 49-64 },
numpages = {16},
url = { https://ijcaonline.org/archives/volume187/number25/performance-analysis-of-raspberry-pi-4b-8gb-beowulf-cluster-hpcg-benchmarking/ },
doi = { 10.5120/ijca2025925449 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2025-07-31T02:40:02+05:30
%A Dimitrios Papakyriakou
%A Ioannis S. Barbounakis
%T Performance Analysis of Raspberry Pi 4B (8GB) Beowulf Cluster: HPCG Benchmarking
%J International Journal of Computer Applications
%@ 0975-8887
%V 187
%N 25
%P 49-64
%D 2025
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The High-Performance Conjugate Gradient (HPCG) benchmark has emerged as a complementary metric to the High Performance LINPACK (HPL) [1], aiming to evaluate real-world high-performance computing (HPC) workloads that emphasize memory access patterns, cache behavior, and sparse matrix operations. Unlike HPL, which reflects peak floating-point capability, HPCG simulates practical scientific computations involving iterative solvers and irregular memory access, offering a more realistic performance indicator. This study investigates the implementation and analysis of the HPCG benchmark on a 24-node Beowulf cluster built with Raspberry Pi 4B devices, each equipped with 8GB LPDDR4 RAM and ARM Cortex-A72 processors. Both strong scaling (fixed problem size with increasing nodes) and weak scaling (proportional increase in problem size and nodes) methodologies were applied to assess system performance across various configurations. Metrics such as median execution time, floating-point throughput (GFLOP/s), and memory bandwidth (GB/s) were collected and analyzed. The results reveal that HPCG performance on this ARM-based cluster is primarily constrained by memory bandwidth saturation, lack of hardware-level floating-point acceleration, and network communication bottlenecks. Strong scaling experiments show minimal performance gains beyond 4–8 nodes, while weak scaling maintains computational stability up to moderate cluster sizes. Notably, the absence of measurable MPI communication overhead (ExchangeHalo time) underscores the limited halo data exchange under small subdomain decomposition and short runtimes. This study highlights the limitations and potential of energy-efficient, low-cost single-board clusters for realistic HPC workloads. 
The findings provide a methodological basis for benchmarking sparse solvers on ARM systems and inform future efforts in optimizing parallelism, memory access, and interconnect efficiency in edge computing, education, and embedded HPC environments.
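
The strong- and weak-scaling comparisons described in the abstract can be illustrated with a minimal sketch that turns repeated run times into median time, speedup, and parallel efficiency. The function names and all timing values below are hypothetical placeholders chosen for illustration; they are not measurements from the paper, and the authors' own post-processing may differ.

```python
from statistics import median


def strong_scaling(times_by_nodes):
    """Fixed problem size: return (nodes, median_time, speedup, efficiency) rows.

    times_by_nodes maps a node count to a list of run times in seconds.
    Speedup is measured against the smallest node count in the input.
    """
    base_nodes = min(times_by_nodes)
    base_time = median(times_by_nodes[base_nodes])
    rows = []
    for nodes in sorted(times_by_nodes):
        t = median(times_by_nodes[nodes])
        speedup = base_time / t
        # Ideal speedup equals the growth in node count relative to the baseline.
        efficiency = speedup / (nodes / base_nodes)
        rows.append((nodes, t, speedup, efficiency))
    return rows


def weak_scaling(times_by_nodes):
    """Problem size grows with node count: return (nodes, median_time, efficiency).

    Under ideal weak scaling the run time stays constant, so efficiency is
    the baseline median time divided by the median time at each node count.
    """
    base_time = median(times_by_nodes[min(times_by_nodes)])
    rows = []
    for nodes in sorted(times_by_nodes):
        t = median(times_by_nodes[nodes])
        rows.append((nodes, t, base_time / t))
    return rows


# Hypothetical timings in seconds (illustrative only, not data from the paper).
strong_runs = {1: [100.0, 102.0, 98.0], 2: [50.0, 52.0, 51.0], 4: [40.0, 40.0, 41.0]}
weak_runs = {1: [10.0], 2: [12.5], 4: [20.0]}

strong_rows = strong_scaling(strong_runs)
weak_rows = weak_scaling(weak_runs)

for nodes, t, speedup, eff in strong_rows:
    print(f"strong  nodes={nodes:2d}  median={t:6.1f}s  speedup={speedup:.2f}  eff={eff:.2f}")
for nodes, t, eff in weak_rows:
    print(f"weak    nodes={nodes:2d}  median={t:6.1f}s  eff={eff:.2f}")
```

In a sketch like this, strong-scaling efficiency that falls well below 1 as nodes are added is the kind of pattern the abstract attributes to memory bandwidth saturation and communication limits, while weak-scaling efficiency close to 1 corresponds to the computational stability it reports for moderate cluster sizes.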

References
  1. Dimitrios Papakyriakou, Ioannis S. Barbounakis. High Performance Linpack (HPL) Benchmark on Raspberry Pi 4B (8GB) Beowulf Cluster. International Journal of Computer Applications. 185, 25 (Jul 2023), 11-19. DOI=10.5120/ijca2023923005
  2. Dimitrios Papakyriakou, Ioannis S. Barbounakis. Performance Analysis of Raspberry Pi 4B (8GB) Beowulf Cluster: STREAM Benchmarking. International Journal of Computer Applications. 186, 78 (Apr 2025), 41-55. DOI=10.5120/ijca2025924687
  3. Raspberry Pi 3 Model B+. [Online]. Available: https://www.raspberrypi.com/products/raspberry-pi-3-model-b-plus/
  4. Raspberry Pi 4 Model B. [Online]. Available: https://www.raspberrypi.com/products/raspberry-pi-4-model-b/
  5. Raspberry Pi 4 Model B specifications. [Online]. Available: https://magpi.raspberrypi.com/articles/raspberry-pi-4-specs-benchmarks
  6. HPCG Benchmark. [Online]. Available: https://www.hpcg-benchmark.org/
  7. Jack Dongarra, Michael A. Heroux, Piotr Luszczek. "High-performance conjugate-gradient benchmark: A new metric for ranking high-performance computing systems," The International Journal of High Performance Computing Applications, SAGE, vol. 30, no. 1, pp. 3-10, August 17, 2015.
  8. Jack Dongarra, Michael A. Heroux. "Toward a New Metric for Ranking High Performance Computing Systems," Sandia National Laboratories Technical Report SAND2013-4744, June 2013.
  9. Jack Dongarra, Michael A. Heroux, Piotr Luszczek. High-Performance Conjugate Gradient (HPCG) Benchmark. University of Tennessee and Sandia National Laboratories, 2016. [Online]. Available: https://www.hpcg-benchmark.org
Index Terms

Computer Science
Information Sciences

Keywords

Raspberry Pi 4, Beowulf cluster, Cluster, Message Passing Interface (MPI), MPICH, Memory Performance, Low-cost Clusters, Parallel Computing, ARM Architecture, HPCG Benchmark