Research Article

Intelligent Thread-Specific Rename Register Allocation for Simultaneous Multi-Threading Processors Based on Cache Behavior

by An Do, Wei-Ming Lin
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 185 - Number 29
Year of Publication: 2023
Authors: An Do, Wei-Ming Lin
10.5120/ijca2023923037

An Do, Wei-Ming Lin. Intelligent Thread-Specific Rename Register Allocation for Simultaneous Multi-Threading Processors Based on Cache Behavior. International Journal of Computer Applications. 185, 29 (Aug 2023), 1-9. DOI=10.5120/ijca2023923037

@article{10.5120/ijca2023923037,
author = { An Do and Wei-Ming Lin },
title = { Intelligent Thread-Specific Rename Register Allocation for Simultaneous Multi-Threading Processors Based on Cache Behavior },
journal = { International Journal of Computer Applications },
issue_date = { Aug 2023 },
volume = { 185 },
number = { 29 },
month = { Aug },
year = { 2023 },
issn = { 0975-8887 },
pages = { 1--9 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume185/number29/32873-2023923037/ },
doi = { 10.5120/ijca2023923037 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:27:20.608072+05:30
%A An Do
%A Wei-Ming Lin
%T Intelligent Thread-Specific Rename Register Allocation for Simultaneous Multi-Threading Processors Based on Cache Behavior
%J International Journal of Computer Applications
%@ 0975-8887
%V 185
%N 29
%P 1-9
%D 2023
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Simultaneous Multi-Threading (SMT) processors allow multiple threads to share resources, such as execution units, caches, and pipelines, within the same processor to improve overall system throughput and utilization. The physical register file is one of the most crucial shared resources, because how it is distributed among threads can significantly affect system performance. One or a few threads holding too many shared registers can obstruct the execution of other threads, hurting overall performance. In this paper, we develop an efficient register-file-sharing algorithm based on the number of L2 cache misses. To determine the relationship between L2 cache misses and rename register utilization, the analysis begins by running programs in a single-threaded environment. This relationship then becomes the foundation for an algorithm that optimizes the use of shared registers. Simulation on M-sim [8] shows that the proposed algorithm increases throughput by up to 63% compared to the default case while preserving execution fairness among threads.
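
The abstract does not spell out the allocation rule, so the following is only a minimal sketch of what a miss-driven per-thread register cap might look like, not the authors' algorithm. It assumes that a thread with more recent L2 misses should receive a smaller cap. The pool size, the per-thread floor, the `1 / (1 + misses)` weighting, and all names (`ThreadState`, `register_caps`, `may_rename`) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical parameters; the abstract does not state any of these values.
TOTAL_RENAME_REGS = 160   # size of the shared rename register pool
MIN_REGS_PER_THREAD = 16  # floor so that no thread is starved


@dataclass
class ThreadState:
    tid: int
    l2_misses: int  # L2 misses counted in the last sampling interval


def register_caps(threads, total=TOTAL_RENAME_REGS, floor=MIN_REGS_PER_THREAD):
    """Return {tid: cap}, the most rename registers each thread may hold.

    Assumed policy: weight each thread by 1 / (1 + misses), so a thread
    with many recent L2 misses is given a smaller share of the pool.
    """
    weights = {t.tid: 1.0 / (1 + t.l2_misses) for t in threads}
    weight_sum = sum(weights.values())
    # Registers left to distribute after every thread receives the floor.
    spare = total - floor * len(threads)
    return {
        tid: floor + int(spare * w / weight_sum)
        for tid, w in weights.items()
    }


def may_rename(held, cap):
    """Rename-stage gate: allow a new allocation only below the cap."""
    return held < cap
```

In this sketch the caps would be recomputed once per sampling interval, and `may_rename` would be checked before each register allocation. The floor is one simple way to keep a miss-heavy thread from being starved; the fairness mechanism evaluated in the paper may differ.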

References
  1. Shane Carroll and Wei-Ming Lin. Latency-aware write buffer resource control in multi-threaded cores. Int. J. Distrib. Parallel Syst.(IJDPS), 7, 2016.
  2. Francisco J Cazorla, Alex Ramirez, Mateo Valero, and Enrique Fernández. Dynamically controlled resource allocation in smt processors. In 37th International Symposium on Microarchitecture (MICRO-37’04), pages 171–182. IEEE, 2004.
  3. Hiroaki Hirata, Kozo Kimura, Satoshi Nagamine, Yoshiyuki Mochizuki, Akio Nishimura, Yoshimori Nakase, and Teiji Nishizawa. An elementary processor architecture with simultaneous instruction issuing from multiple threads. In Proceedings of the 19th annual international symposium on Computer architecture, pages 136–145, 1992.
  4. Sherifdeen Lawal, Yilin Zhang, and WM Lin. Prioritizing write buffer occupancy in simultaneous multi-threading processors. Journal of Emerging Trends in Computing and Information Sciences, 6(10):515–522, 2015.
  5. Jack L Lo, Sujay S Parekh, Susan J Eggers, Henry M Levy, and Dean M Tullsen. Software-directed register deallocation for simultaneous multithreaded processors. IEEE Transactions on Parallel and Distributed Systems, 10(9):922–933, 1999.
  6. Teresa Monreal, Antonio González, Mateo Valero, José González, and Víctor Viñals. Dynamic register renaming through virtual-physical registers. Journal of Instruction Level Parallelism, 2:4–16, 2000.
  7. Tilak Kumar Develapura Nagaraju, Caleb Douglas, Wei-Ming Lin, and Eugene John. Effective Dispatching in Simultaneous Multithreading (SMT) Processors by Capping Per-thread Resource Utilization. PhD thesis, University of Texas at San Antonio, 2011.
  8. Joseph Sharkey, Dmitry Ponomarev, and Kanad Ghose. M-sim: a flexible, multithreaded architectural simulation environment. Technical report, Department of Computer Science, State University of New York at Binghamton, 2005.
  9. Timothy Sherwood, Erez Perelman, Greg Hamerly, and Brad Calder. Automatically characterizing large scale program behavior. ACM SIGPLAN Notices, 37(10):45–57, 2002.
  10. SPEC. Standard performance evaluation corporation. https://www.spec.org.
  11. Dean M Tullsen, Susan J Eggers, Joel S Emer, Henry M Levy, Jack L Lo, and Rebecca L Stamm. Exploiting choice: Instruction fetch and issue on an implementable simultaneous multithreading processor. In Proceedings of the 23rd annual international symposium on Computer architecture, pages 191–202, 1996.
  12. Dean M Tullsen, Susan J Eggers, and Henry M Levy. Simultaneous multithreading: Maximizing on-chip parallelism. In Proceedings of the 22nd annual international symposium on Computer architecture, pages 392–403, 1995.
  13. Wenjun Wang and Wei-Ming Lin. Real-time physical register file allocation with neural networks for simultaneous multi-threading processors. International Journal of High Performance Systems Architecture, 8(3):146–158, 2018.
  14. Yilin Zhang, Marcus Hays, Wei-Ming Lin, and Eugene John. Autonomous control of issue queue utilization for simultaneous multi-threading processors. In Proceedings of the High Performance Computing Symposium, pages 1–8, 2014.
  15. Yilin Zhang and Wei-Ming Lin. Capping speculative traces to improve performance in simultaneous multi-threading cpus. In 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum, pages 1555–1564. IEEE, 2013.
  16. Yilin Zhang and Wei-Ming Lin. Efficient physical register file allocation in simultaneous multi-threading cpus. In 33rd IEEE International Performance Computing and Communications Conference (IPCCC 2014), Austin, Texas, 2014.
  17. Yilin Zhang and Wei-Ming Lin. Intelligent usage management of shared resources in simultaneous multi-threading processors. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), page 27. The Steering Committee of The World Congress in Computer Science, Computer . . . , 2015.
Index Terms

Computer Science
Information Sciences

Keywords

Simultaneous Multi-threading Register Rename Register Capping L2 Cache Miss Resource Sharing